白菜: 2007

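Tuesday, December 18, 2007

After starting work

In the blink of an eye I have been at Tencent for five months, so this blog has been growing weeds for five months too. china-pub has sent me a few more books, which naturally means another hundred-odd yuan has left my side. I was of course delighted when the books arrived, but then I remembered last month's issue of Science Fiction World lying unfinished on my bed, and that is a real worry. When you have time you have no money, and when you have money you have no time. Such is life! I have also contributed nothing new to 可爱的python (Lovely Python) for a long while, which I am ashamed of. Sigh.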

Wednesday, September 26, 2007

This is the first time I have seen The Tao of Programming translated this well: http://livecn.huasing.org/tao_of_programming.htm

Prince Wang's programmer was coding software. His fingers danced upon the keyboard. The program compiled without an error message, and the program ran like a gentle wind. "Excellent!" the Prince exclaimed, "Your technique is faultless!" "Technique?" said the programmer, turning from his terminal, "What I follow is the Tao -- beyond all technique. When I first began to program I would see before me the whole program in one mass. After three years I no longer saw this mass. Instead, I used subroutines. But now I see nothing. My whole being exists in a formless void. My senses are idle. My spirit, free to work without a plan, follows its own instinct. In short, my program writes itself. True, sometimes there are difficult problems. I see them coming, I slow down, I watch silently. Then I change a single line of code and the difficulties vanish like puffs of idle smoke. I then compile the program. I sit still and let the joy of the work fill my being. I close my eyes for a moment and then log off." Prince Wang said, "Would that all of my programmers were as wise!"

程序员为公子王写软件，指飞键舞，不差丝毫，行之如风。公子王曰：『嘻，善哉！技盖至此乎？』程序员释键对曰：『臣之所好者道也，进乎技矣。始臣之编程之时，所见无非程序者；三年之后，未尝见程序也，见其子程序也；方今之时，臣以神遇而不以目视，官知止而神欲行，因其固然，程序自写之。诚然，尝至于难者，吾见其难为，怵然为戒，视为止，行为迟，改其一字，謋然已解，如烟随风。使之编译，释键而坐，为之踌躇满志，闭目而log off之。』公子王曰：『吾之程序员皆如此，则其善焉！』

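Tuesday, June 26, 2007

Implementing a django url dispatcher

On a whim I implemented a django-style url dispatcher, and it turned out to be much simpler than I expected.

http://djangodispatcher.googlecode.com/svn/trunk/mapper.py
http://djangodispatcher.googlecode.com/svn/trunk/test.py

The code that does the actual work is only 20 to 30 lines, yet it is basically feature complete, including hierarchical url configuration and some information to help with debugging when an exception occurs.

PS: I seem to have fallen in love with Test Driven lately.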
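To give a feel for how little code such a dispatcher needs, here is a minimal sketch in modern Python 3. It is not the mapper.py linked above; the names `resolve`, `include`, `Resolver404` and `article_detail` are made up for illustration, and it only assumes the general idea of regex patterns that can nest.

```python
import re


class Resolver404(Exception):
    """Raised when no pattern matches; carries the patterns that were tried."""

    def __init__(self, path, tried):
        super().__init__("no match for %r, tried: %s" % (path, tried))
        self.path = path
        self.tried = tried


def include(patterns):
    """Mark a nested pattern list so resolve() descends into it."""
    return ("include", patterns)


def resolve(path, patterns, prefix=""):
    """Return (view, kwargs) for the first pattern that matches path."""
    tried = []
    for regex, target in patterns:
        match = re.match(regex, path)
        if match is None:
            tried.append(prefix + regex)
            continue
        if isinstance(target, tuple) and target[0] == "include":
            try:
                # Strip the matched prefix and resolve the rest one level down.
                view, kwargs = resolve(path[match.end():], target[1], prefix + regex)
            except Resolver404 as exc:
                tried.extend(exc.tried)
                continue
            return view, {**match.groupdict(), **kwargs}
        return target, match.groupdict()
    # The list of tried patterns is the debugging aid.
    raise Resolver404(path, tried)


def article_detail(year, slug):
    return "article %s/%s" % (year, slug)


blog_patterns = [
    (r"^(?P<slug>[\w-]+)/$", article_detail),
]
urlpatterns = [
    (r"^blog/(?P<year>\d{4})/", include(blog_patterns)),
]

view, kwargs = resolve("blog/2007/hello-world/", urlpatterns)
print(view(**kwargs))
```

Named groups collected at every level are merged and passed to the view as keyword arguments, which is roughly how the hierarchical configuration pays off.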

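Friday, June 22, 2007

How to write correct programs while drunk

The answer is simple: Test Driven. Haha, this (http://code.google.com/p/pylifegame/) is a good example! I am drunk, so I won't say more; go look for yourself. I am off to bed, mm ...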

Wednesday, June 20, 2007

Faint! An editor with the same name as me

Yi is a text editor written and extensible in Haskell. The goal of Yi is to provide a flexible, powerful and correct editor core dynamically scriptable in Haskell.

So pickle is this interesting

Pickle: An interesting stack language

It turns out that pickle itself is a tiny stack-based language, heh, rather interesting. Reading pickle.py and pickletools.py reveals more of the details.
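A quick way to see the stack language for yourself is to disassemble a pickle with the standard `pickletools` module. This is a small sketch in Python 3; the sample dictionary is arbitrary, and the exact opcodes you see depend on the protocol and Python version.

```python
import io
import pickle
import pickletools

# Protocol 0 is the old text protocol, which is the easiest to read.
payload = pickle.dumps({"a": [1, 2]}, protocol=0)

# dis() prints one line per opcode: push values, mark the stack, build containers.
buffer = io.StringIO()
pickletools.dis(payload, out=buffer)
listing = buffer.getvalue()
print(listing)

# genops() yields (opcode, argument, position) triples for programmatic use.
opcode_names = [opcode.name for opcode, argument, position in pickletools.genops(payload)]
print(opcode_names)
```

Every pickle ends with the STOP opcode, which pops the finished object off the stack and returns it.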

I translated this article

The Python 3000 status report can also be read on guido's Chinese blog: http://blog.csdn.net/gvanrossum/archive/2007/06/20/1658829.aspx

Tuesday, June 19, 2007

Python 3000 Status Update (Long!)

Python 3000 Status Update (Long!) by Guido van Rossum """ Summary Here's a long-awaited update on where the Python 3000 project stands. We're looking at a modest two months of schedule slip, and many exciting new features. I'll be presenting this in person several times over the next two months. """

Friday, June 8, 2007

I have graduated!

http://picasaweb.google.com/yi.codeplayer/070608

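Sunday, June 3, 2007

SQLAlchemy Examples

The zblog example that ships with SQLAlchemy shows off some very useful SQLAlchemy features.

Counting the comments of each post

Say we want the number of comments for each post while displaying the post list. With django you have to give up the benefits of the ORM and run the sql yourself, otherwise you end up executing n+1 SQL statements. In SQLAlchemy you can map an arbitrary select statement to a class, so a single SQL statement does the job and you still keep the benefits of the ORM. Below is the code copied over unchanged (only the formatting was adjusted):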

  # Post mapper, these are posts within a blog.
  # since we want the count of comments for each post,
  # create a select that will get the posts
  # and count the comments in one query.
  posts_with_ccount = select(
      [c for c in tables.posts.c if c.key != 'body'] + [
          func.count(tables.comments.c.comment_id).label('comment_count')
      ],
      from_obj = [
          outerjoin(tables.posts, tables.comments)
      ],
      group_by=[
          c for c in tables.posts.c if c.key != 'body'
      ]
      ) .alias('postswcount')

  # then create a Post mapper on that query.
  # we have the body as "deferred" so that it loads only when needed,
  # the user as a Lazy load, since the lazy load will run only once per user and
  # its usually only one user's posts is needed per page,
  # the owning blog is a lazy load since its also probably loaded into the identity map
  # already, and topics is an eager load since that query has to be done per post in any
  # case.
  mapper(Post, posts_with_ccount, properties={
      'id':posts_with_ccount.c.post_id,
      'body':deferred(tables.posts.c.body),
      'user':relation(user.User, lazy=True,
               backref=backref('posts', cascade="all, delete-orphan")),
      'blog':relation(Blog, lazy=True,
               backref=backref('posts', cascade="all, delete-orphan")),
      'topics':relation(TopicAssociation, lazy=False, private=True,
               association=Topic, backref='post')
  }, order_by=[desc(posts_with_ccount.c.datetime)])
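For readers who want to see the single statement that this mapping boils down to, here is a sketch using only the standard `sqlite3` module. The table layout and sample rows are invented for the example and are much simpler than zblog's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, headline TEXT);
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(post_id),
        body TEXT
    );
    INSERT INTO posts VALUES (1, 'first'), (2, 'second');
    INSERT INTO comments VALUES (1, 1, 'a'), (2, 1, 'b');
""")

# One query instead of n+1: the outer join keeps posts without comments,
# and COUNT() over the comment key yields 0 for them.
rows = conn.execute("""
    SELECT posts.post_id, posts.headline,
           COUNT(comments.comment_id) AS comment_count
    FROM posts LEFT OUTER JOIN comments ON comments.post_id = posts.post_id
    GROUP BY posts.post_id, posts.headline
    ORDER BY posts.post_id
""").fetchall()
print(rows)
```

The SQLAlchemy version above builds the same kind of join and group by, then maps the result to the Post class so that comment_count shows up as an ordinary attribute.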

Threaded comments

The mapping is as follows:

  # comment mapper.  This mapper is handling a hierarchical relationship on itself,
  # and contains
  # a lazy reference both to its parent comment and its list of child comments.
  mapper(Comment, tables.comments, properties={
      'id':tables.comments.c.comment_id,
      'post':relation(Post, lazy=True,
               backref=backref('comments', cascade="all, delete-orphan")),
      'user':relation(user.User, lazy=False,
               backref=backref('comments', cascade="all, delete-orphan")),
      'parent':relation(Comment,
               primaryjoin=tables.comments.c.parent_comment_id==tables.comments.c.comment_id,
               foreignkey=tables.comments.c.comment_id, lazy=True, uselist=False),
      'replies':relation(Comment,
               primaryjoin=tables.comments.c.parent_comment_id==tables.comments.c.comment_id,
               lazy=True, uselist=True, cascade="all"),
  })

Often we need to fetch all the comments of a post in one go. We can pull the data out with a single select first and then build the tree structure by hand:

# we define one special find-by for the comments of a post, which is going to make its own
# "noload" mapper and organize the comments into their correct hierarchy in one pass. hierarchical
# data normally needs to be loaded by separate queries for each set of children, unless you
# use a proprietary extension like CONNECT BY.
def find_by_post(post):
  """returns a hierarchical collection of comments based on a given criterion.
  uses a mapper that does not lazy load replies or parents, and instead
  organizes comments into a hierarchical tree when the result is produced.
  """
  q = session().query(Comment).options(noload('replies'), noload('parent'))
  comments = q.select_by(post_id=post.id)
  result = []
  d = {}
  for c in comments:
      d[c.id] = c
      if c.parent_comment_id is None:
          result.append(c)
          c.parent=None
      else:
          parent = d[c.parent_comment_id]
          parent.replies.append(c)
          c.parent = parent
  return result

Comment.find_by_post = staticmethod(find_by_post)
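The tree-building step does not depend on SQLAlchemy at all. Here is a stand-alone sketch of the same idea on plain tuples; the `(id, parent_id, text)` row shape and the function name are assumptions made for the example. Unlike the loop above, it uses two passes so that a reply may appear in the result set before its parent.

```python
def build_comment_tree(rows):
    """Turn flat (id, parent_id, text) rows into a list of root nodes."""
    # First pass: create a node for every row, keyed by id.
    nodes = {}
    for comment_id, parent_id, text in rows:
        nodes[comment_id] = {"id": comment_id, "text": text, "replies": []}

    # Second pass: hang every node under its parent, or collect it as a root.
    roots = []
    for comment_id, parent_id, _text in rows:
        node = nodes[comment_id]
        if parent_id is None:
            roots.append(node)
        else:
            nodes[parent_id]["replies"].append(node)
    return roots


rows = [
    (3, 1, "reply to first"),   # child listed before its parent on purpose
    (1, None, "first"),
    (2, None, "second"),
    (4, 3, "nested reply"),
]
tree = build_comment_tree(rows)
print([root["text"] for root in tree])
```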

Wednesday, May 30, 2007

multitask and Hive

multitask

multitask allows Python programs to use generators (aka coroutines) to perform cooperative multitasking and asynchronous I/O. Applications written using multitask consist of a set of cooperating tasks that yield to a shared task manager whenever they perform a (potentially) blocking operation, such as I/O on a socket or getting data from a queue. The task manager temporarily suspends the task (allowing other tasks to run in the meantime) and then restarts it when the blocking operation is complete. Such an approach is suitable for applications that would otherwise have to use select() and/or multiple threads to achieve concurrency.

Producer/Consumer with multitask library

Hive

This is a basic concurrency module that uses only dependencies available in the Python 2.5 standard library. It allows the creation of a jobfile for uses to queue work that any number of worker processes with access to the jobfile can pull from the queue and run.

Seeing these two libraries immediately reminded me of a piece of code I once wrote. It seems someone is finally putting the continuation-like power of Python 2.5's enhanced yield expression to some practical use.
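The core trick is small enough to sketch with plain generators. The following is not the multitask API; `run`, `producer` and `consumer` are hypothetical names, and the "task manager" is just a round-robin loop written for Python 3.

```python
from collections import deque


def producer(queue, items):
    for item in items:
        queue.append(item)
        yield  # hand control back to the scheduler


def consumer(queue, results, expected):
    while len(results) < expected:
        if queue:
            results.append(queue.popleft())
        yield  # nothing more to do right now, let others run


def run(tasks):
    """Round-robin scheduler: resume each task until it finishes."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)


queue, results = deque(), []
run([producer(queue, [1, 2, 3]), consumer(queue, results, 3)])
print(results)
```

A real library additionally parks tasks that wait on sockets or queues instead of polling them on every round, but the yield-to-the-manager structure is the same.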

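Polymorphic Associations in Rails

Polymorphic Associations with SQLAlchemy

The SQLAlchemy lead shows how to implement rails' Polymorphic Associations with sqlalchemy. While I was at it I read an introduction to what Rails calls Polymorphic Associations, and discovered that it is exactly what django's content-type app, which I wrote about long ago, does. Here "app" means plugin.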

Saturday, May 26, 2007

Python and vim: Two great tastes that go great together

Python and vim: Two great tastes that go great together

Extending vim with python is not a new idea, but this is the first tutorial I have seen. I remember tocer saying he wanted to write a vim library in python; I wonder whether there has been any progress, heh.

Evolution of a Python programmer

http://dis.4chan.org/read/prog/1180084983/

Haha, rather fun. Let me add one more, the Python 2.5 programmer:

def fact(x):
    return x * fact(x - 1) if x > 1 else 1
print fact(6)

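Tuesday, May 22, 2007

Elixir Examples

Sometimes I write wiki in my blog, sometimes I write blog in my wiki, and sometimes I post in my blog a blog entry I wrote in my wiki ;-)

Multiple inheritance is a wonderful thing

While writing models I noticed some things repeating. My first reaction was to write a base class and pull the repeated parts into it, but inheritance between Model classes is not that convenient and would probably affect the behaviour of the ORM. What to do? Luckily python has multiple inheritance. Here is the class that handles those repeated chores in the project: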

from datetime import datetime

class ModelMixin(object):
    def save(self):
        if not self.id: # creation time
            if hasattr(self, 'pubdate'):
                self.pubdate = datetime.now()
            if hasattr(self, 'pubtime'):
                self.pubtime = datetime.now()

        if hasattr(self, 'updatedate'):
            self.updatedate = datetime.now()
        if hasattr(self, 'updatetime'):
            self.updatetime = datetime.now()
        if hasattr(self, 'number'): # which publication of the day this is
            self.number = self.__class__.objects.filter(pubdate=datetime.now()).count()+1

        if hasattr(self, 'before_save'):
            self.before_save()
        super(ModelMixin, self).save()
        if hasattr(self, 'after_save'):
            self.after_save()

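Note: django is going to deprecate things like auto_now_add and auto_now as too magic, and recommends handling them in save instead, which makes the class above even more useful. How do you use it?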

class Product(ModelMixin, models.Model):
    pubdate = models.DateField(u'...', editable=False)
    number = models.IntegerField(u'...', editable=False)
    ...

This way pubdate and number automatically take on their intended meaning. ModelMixin also defines the before_save and after_save hooks; a concrete model can put some code in these two methods, for example:

    def before_save(self):
        self.totalprice = self.count * self.product.unitprice

...

    def after_save(self):
        if self._create:
            p = OutProduct(postuser=self.postuser,count=1,
                    pubdate=self.pubdate,mainproduct=self)
            p.save()

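All of this is code copied straight out of the project; I will let you guess what exactly it means, heh. Implementing multiple inheritance is actually a fairly complex affair, and complicated multiple inheritance can produce some peculiar behaviour. Still, as long as you stick to a few good habits (such as using super consistently, tedious as it is to write) and understand the basic principles of multiple inheritance, you will mostly not run into anything strange. For how python implements multiple inheritance, see: The Python 2.3 Method Resolution Order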
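To see why the mixin has to come first in the list of bases, it helps to print the method resolution order. The sketch below uses Python 3 and stand-in classes; `Model`, `HookMixin` and `Order` are invented for the example and have nothing to do with Django's real base class.

```python
class Model:
    """Stand-in for an ORM base class (hypothetical, not Django's)."""

    def __init__(self):
        self.log = []

    def save(self):
        self.log.append("Model.save")


class HookMixin:
    def save(self):
        if hasattr(self, "before_save"):
            self.before_save()
        super().save()  # the next class in the MRO, here Model
        if hasattr(self, "after_save"):
            self.after_save()


class Order(HookMixin, Model):
    def before_save(self):
        self.log.append("before_save")

    def after_save(self):
        self.log.append("after_save")


order = Order()
order.save()
mro_names = [cls.__name__ for cls in Order.__mro__]
print(mro_names)
print(order.log)
```

HookMixin does not inherit from Model, yet its super().save() reaches Model.save because super follows the MRO of the instance's class, not the mixin's own bases.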

newforms is so nice to use

Create a project newformstutorials, create an app blog, and define this in blog's models:

class Article(models.Model):
    title = models.CharField(u'标题', maxlength=255)
    author = models.CharField(u'作者', maxlength=20)
    hits = models.IntegerField(u'点击数', default=0, editable=False)
    content = models.TextField(u'内容')

Configure the database, add newformstutorials.blog to INSTALLED_APPS, run manage.py syncdb, then manage.py shell, and then:

In [1]: import django.newforms as forms

In [2]: from newformstutorials.blog.models import Article

In [3]: ArticleForm = forms.form_for_model(Article)

In [4]: form = ArticleForm()

In [5]: print unicode(form)
<tr><th><label for="id_title">标题:</label></th><td><input id="id_title" type="t
ext" name="title" maxlength="255" /></td></tr>
<tr><th><label for="id_author">作者:</label></th><td><input id="id_author" type=
"text" name="author" maxlength="20" /></td></tr>
<tr><th><label for="id_content">内容:</label></th><td><textarea id="id_content"
rows="10" cols="40" name="content"></textarea></td></tr>

In [6]: print form.as_ul()
<li><label for="id_title">标题:</label> <input id="id_title" type="text" name="t
itle" maxlength="255" /></li>
<li><label for="id_author">作者:</label> <input id="id_author" type="text" name=
"author" maxlength="20" /></li>
<li><label for="id_content">内容:</label> <textarea id="id_content" rows="10" co
ls="40" name="content"></textarea></li>

And there is a blank form. It is a form for adding an article, so let's use it to add some data:

In [7]: form = ArticleForm({'title':'some title','author':'huangyi'})

In [8]: form.is_valid()
Out[8]: False

In [9]: form.errors
Out[9]: {'content': [u'This field is required.']}

In [10]: form = ArticleForm({'title':'some title','author':'huangyi', 'content':
'some contents...'})

In [11]: form.is_valid()
Out[11]: True

In [12]: article = form.save(commit=True)

OK, the data is saved just like that. Now let's try the page for updating data:

In [13]: ChangeForm = forms.form_for_instance(article)

In [14]: form = ChangeForm()

In [15]: print unicode(form)
<tr><th><label for="id_title">标题:</label></th><td><input id="id_title" type="t
ext" name="title" value="some title" maxlength="255" /></td></tr>
<tr><th><label for="id_author">作者:</label></th><td><input id="id_author" type=
"text" name="author" value="huangyi" maxlength="20" /></td></tr>
<tr><th><label for="id_content">内容:</label></th><td><textarea id="id_content"
rows="10" cols="40" name="content">some contents...</textarea></td></tr>

In [16]: form = ChangeForm({'title':'another title', 'author':'huangyi', 'conten
t':'other contents...'})
In [17]: form.is_valid()
Out[17]: True

In [18]: form.save()
Out[18]: <Article: Article object>

In [19]: article = Article.objects.get(id=article.id)

In [20]: article.title
Out[20]: 'another title'

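Monday, May 21, 2007

django newforms admin

I have done another project with django, and since it is mostly back-office stuff I decided to use django's newforms admin branch! (I am not recommending that everyone start using the newforms admin branch right now, though. If you are not sure of yourself, it is best to treat it as something to play with for now; I fixed several of its bugs during development.)

The newforms admin branch refactors the admin module on top of newforms, and along the way changes some design decisions, which greatly improves how customizable the admin is.

First, using newforms cleanly separates three parts: db field, form field and widget. The db field belongs to the ORM and deals with model matters, the form field validates user input, and the widget renders the ui. There seems to be a hint of MVC in there ;-) In newforms admin widgets can be swapped out easily, which is a real pleasure.

Second, the new admin separates the admin definition from the model. It looks like a bit more to write, but the benefits are obvious: the model definition is cleaner, and the new admin is designed in a more reusable form, so used well it can save quite a lot of code and accomplish things that were very hard with the old admin.

The core of the new admin is AdminSite and ModelAdmin. AdminSite handles site-wide matters such as the index page, user login, logout, password change, permission control, and model registration; ModelAdmin handles the admin pages of a single model. The benefit is that you can subclass these two classes, override the appropriate methods, and do pretty much whatever you like. For example, in this project I wrote these custom admin classes: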

class CustomAdmin(admin.ModelAdmin):
    def before_save(self, request, instance, form, change=False):
        pass

    def save_add(self, request, model, form, post_url_continue):
        def custom_save(form, commit=False):
            instance = model()
            new_object = forms.save_instance(form, instance,
                    fail_message='created', commit=False)
            self.before_save(request, new_object, form)
            if commit:
                new_object.save()
                for f in model._meta.many_to_many:
                    if f.name in form.cleaned_data:
                        setattr(new_object, f.attname, form.cleaned_data[f.name])
            return new_object
        form.__class__.save = custom_save
        return super(CustomAdmin, self).save_add(request, model, form,
                    post_url_continue)

    def save_change(self, request, model, form):
        def custom_save(form, commit=False):
            from copy import copy
            new_object = forms.save_instance(form,
                    copy(form.original_object),
                    fail_message='changed', commit=False)
            self.before_save(request, new_object, form, change=True)
            if commit:
                new_object.save()
                for f in model._meta.many_to_many:
                    if f.name in form.cleaned_data:
                        setattr(new_object, f.attname, form.cleaned_data[f.name])
            return new_object
        form.__class__.save = custom_save
        return super(CustomAdmin, self).save_change(request, model, form)

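As you can probably tell, this admin provides a before_save hook (you could provide after_save too, but for now I only need before_save). You can subclass it and put code in this method, and it will be executed before the model is saved. You may ask why not simply define the Model's save method. The answer is simple: the Model does not know that request and form exist! In before_save you can do some very interesting things, such as automatically setting a field of the model to the currently logged-in user! This customization was requested long ago. The old solution was to write a middleware that put the request into a threadlocal and then fetch the current request from the threadlocal inside the model. It works, but it is cumbersome and ugly. With before_save it is easy: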

class AutoUserAdmin(CustomAdmin):
    user_field_name = 'postuser'
    def before_save(self, request, instance, form, change=False):
        setattr(instance, self.user_field_name, request.user)
        super(AutoUserAdmin, self).before_save(request, instance, form, change)

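Of course you can also subclass AutoUserAdmin and set your own user_field_name; it could not be simpler. Another common customization is to restrict logged-in users to seeing only what they published themselves, so that they can neither see nor modify what others published. Building on the AutoUserAdmin above: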

class RestrictUserAdmin(AutoUserAdmin):
    def queryset(self, request):
        queries = {self.user_field_name:request.user}
        return super(RestrictUserAdmin, self).queryset(request).\
                filter(**queries)

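Super simple, isn't it? Heh. And don't forget that python supports the legendary multiple inheritance, which means you can inherit from several admin classes at once and get their combined features. For instance, I wrote a custom admin that supports file upload (neither newforms nor newforms admin has file upload support yet, so you have to write it yourself), which I call FileUploadAdmin. Now I want my admin to have the features of both RestrictUserAdmin and FileUploadAdmin. No problem: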

class CommonAdmin(FileUploadAdmin, RestrictUserAdmin):
    date_hierarchy = 'pubdate'
    list_per_page = 15
    ordering = ('-id',)

I also put some common admin settings in there (common for my own project, that is). Now, how do we apply these admins to a model?

class ProductAdmin(CommonAdmin):
    list_display = ('__str__', 'type', 'unitname', 'unitprice',
        'qsinfo', 'postuser', 'pubdate', 'image_view')
    list_filter = ('type', 'pubdate')

admin.site.register(Product, ProductAdmin)

The code above is fine, but I still find it too much trouble. What I actually write is:

admin.site.register(Product,
    CommonAdmin,
    list_display = ('__str__', 'type', 'unitname', 'unitprice',
        'qsinfo', 'postuser', 'pubdate', 'image_view'),
    list_filter = ('type', 'pubdate'),
    section_name = '通用',
)

For the code above to work, though, a small change to the django newforms admin branch is needed. In the file django/contrib/admin/sites.py, at around line 73, below:

        # TODO: Handle options

add:

        # it works
        if options:
            admin_class = type(admin_class.__name__, (admin_class,),
                    options)
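The patch relies on the three-argument form of the built-in `type`, which creates a class at run time. Here is a stand-alone sketch in Python 3 of what that call does; `CommonAdmin` below is a bare stand-in and `with_options` is a hypothetical helper, not part of Django.

```python
class CommonAdmin:
    """Stand-in for a base admin class (hypothetical, no Django involved)."""
    list_per_page = 15

    def describe(self):
        return "%s shows %s" % (type(self).__name__, ", ".join(self.list_display))


def with_options(base, **options):
    # type(name, bases, namespace) builds a new subclass on the fly;
    # the keyword options become class attributes of that subclass.
    return type(base.__name__, (base,), options)


ProductAdmin = with_options(CommonAdmin, list_display=("name", "price"))
print(ProductAdmin().describe())
print(issubclass(ProductAdmin, CommonAdmin), ProductAdmin.list_per_page)
```

The generated class keeps the base class's name, so it looks the same in debugging output while carrying its own options; the base class itself is left untouched.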

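In fact, the greatest joy of using django, and python for that matter, is that you can easily understand code written by other people. Isn't that the greatest joy a programmer can have? ;-) If you start using the django newforms admin branch now, most of the problems you hit will probably be unicode related (mine were). That is because django's development, like python's own, is in the middle of an overall migration to unicode. The biggest tension right now is that the ORM uses plain strings (what python3000 calls byte arrays) while newforms has switched to unicode throughout, which often causes trouble. If you are developing against the latest django svn, be sure to look at the Unicode branch, which explains how to move your program to unicode smoothly. Happy migrating ;-)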

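Tuesday, May 15, 2007

Long time no blog

I have not blogged for a long time. My graduation thesis is finally settled, so I can breathe a long sigh of relief ;-)

rst helped me a great deal while writing the thesis. The university requires a doc version, though, and all its formatting rules are stated in terms of msword, so I had to generate html from rst and then copy it into msword. Watching classmates write their theses directly in msword and then painfully adjust the formatting at the end, I was secretly pleased ;-)

I am continuing to write our 可爱的python (Lovely Python), but so far I still feel unaccustomed to writing introductory material. I keep wanting to say everything in the most concise language possible (which does fit python's philosophy = =" ), so I have had to hold back a lot of python's good stuff.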

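Tuesday, May 1, 2007

python3000 and interfaces

""" The only way to write complex software without failing miserably is to combine a number of simple modules through clearly defined interfaces. """

Speaking abstractly, the concepts of interface, contract, protocol, boundary and so on all mean more or less the same thing. For a language this popular and this widely used, it is a real pity that python has never had a standard implementation of such a thing, although the third-party implementation zope.interface has long been used heavily in zope and twisted.

Under discussion for python 3000 are: pep 3119 Introducing Abstract Base Classes; pep 3124 Overloading, Generic Functions, Interfaces, and Adaptation; (and pep 3133 Introducing Roles, though so far I do not see how it differs much from Abstract Base Classes). They aim to bring some type-constraint capability into python. Combined with the syntax proposed by the already accepted PEP 3107 Function Annotations, this is a really good complement to python, and a good hint for other dynamic languages too! Also, there is a python3000 page on 啄木鸟 (the Woodpecker wiki); everyone is welcome to add their own ideas there ;-)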

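As an illustration of where the interface discussion in the python3000 post was heading, here is a sketch using the `abc` module that later shipped with Python 3 (the outcome of pep 3119) together with function annotations (PEP 3107). The class and function names are invented for the example.

```python
from abc import ABC, abstractmethod


class Readable(ABC):
    """An interface: anything that can produce text via read()."""

    @abstractmethod
    def read(self) -> str:
        ...


class StringSource(Readable):
    def __init__(self, text: str):
        self.text = text

    def read(self) -> str:
        return self.text


def shout(source: Readable) -> str:
    # The annotation documents the expected interface; Python itself
    # does not enforce it at call time.
    return source.read().upper()


print(shout(StringSource("hi")))
print(shout.__annotations__)
```

Trying to instantiate Readable directly raises TypeError because read() is still abstract, which is the kind of constraint the PEPs were after.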
Monday, April 2, 2007

A new pep!

I am posting this a little late, but it is still a PEP well worth reading ;-)

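Sunday, April 1, 2007

A general way to get past the GFW block and reach blogger

This follows the nice approach given in this article, which reportedly works under ie as well. Put simply, first download this proxy file and save it as, say, c:/proxy.pac, then in firefox go to Options -> Advanced -> Connection Settings -> Automatic proxy configuration URL and enter file:///c:/proxy.pac . If you have any questions, see the link above for detailed configuration instructions!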

Friday, March 30, 2007

trying out PyPy

A simple experiment with pypy. I am amazed at how smart its optimizations are!

Tuesday, March 27, 2007

[Fun] Invasion Of The Dynamic Language Weenies

Invasion Of The Dynamic Language Weenies: this article gives you plenty to chew on, hehe ;-)

Sunday, March 18, 2007

django and non programmers

I read two articles, Are You Generic? and Django for non-programmers. django really is a blessing for designers!

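Saturday, March 17, 2007

Dictionaries and dynamic languages

Dictionaries (also called hash tables, associative arrays ...) and dynamic languages go back a very long way together. What makes a dynamic language dynamic is, in the end, that variables are evaluated at run time rather than being fixed at compile time as in static languages. The many namespaces (or scopes) at different levels of a dynamic-language program are really just dictionaries, with variable names as keys and objects as values. Evaluating a variable means looking it up in the namespace it lives in: given the name, find the corresponding object. If it is not found in the local namespace, the lookup may automatically go on to the enclosing or global namespace.

In dynamic languages that support OO, objects are also implemented as dictionaries, with attribute names as keys and attribute values as values, so getting an attribute becomes a dictionary lookup. When something is not found in the subclass the lookup continues in the parent class, and that is how dynamic languages implement inheritance. javascript's prototype is probably the most direct and concise way a dynamic language implements inheritance. python adds a few new syntaxes and concepts for OO, plus support for multiple inheritance, but in essence it is much the same.

The central role of the dictionary is most obvious in lua and javascript; in javascript dictionary and object are practically synonyms. In python the dictionary is not hard to spot either: locals(), globals(), and the __dict__ attribute that (almost) every object has. In a language like ruby it is hidden a little deeper.

[What follows is personal opinion] The dictionary is the soul of a dynamic language, and using a dynamic language well certainly starts with recognizing that. In real software development, however, programming directly against dictionaries as in lua is a bit too crude, javascript is slightly better, python is perfect, and ruby overdoes it.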
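The lookup chain described above can be made visible in a few lines. This sketch is Python 3, and `lookup` is a deliberately simplified stand-in for real attribute access: it ignores descriptors, `__getattr__` and `__slots__`.

```python
class Animal:
    legs = 4

    def __init__(self, name):
        self.name = name


class Dog(Animal):
    sound = "woof"


def lookup(obj, attr):
    """Simplified attribute lookup: instance dict first, then each class in the MRO."""
    if attr in obj.__dict__:
        return obj.__dict__[attr]
    for cls in type(obj).__mro__:
        if attr in cls.__dict__:
            return cls.__dict__[attr]
    raise AttributeError(attr)


dog = Dog("Rex")
print(dog.__dict__)                                      # only per-instance data
print("sound" in Dog.__dict__, "legs" in Dog.__dict__)   # legs lives on Animal
print(lookup(dog, "name"), lookup(dog, "sound"), lookup(dog, "legs"))
```

The instance dictionary holds only name; sound is found in Dog's dictionary and legs in Animal's, which is inheritance expressed as a chain of dictionary lookups.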

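Monday, March 12, 2007

An introduction to pypy

I wrote an article introducing pypy before, but I feel some things were neither clear nor accurate enough.

pypy has two parts: a python implementation and a compiler. The name pypy refers to the first part: python implemented in python. That is not quite accurate, though. To be precise it is python implemented in rpython, where rpython is a subset of python. Do not get confused: although rpython is not full python, the python implementation written in rpython can interpret the complete python language.

Why write this python implementation in rpython? That brings in the second part of pypy: the compiler. It is a compiler for rpython, or rather a compiler with an rpython front end, currently its only front end. It has quite a few back ends, however, meaning that it supports many target languages, the more important being c, cli, javascript ...

Putting the two parts together reveals what is most significant about pypy. What do we get when we use this compiler to compile the python implementation written in rpython? A python implementation written in c, a python implementation written in .net (although the cli back end cannot yet compile this python implementation) ...

I think this introduction is reasonably brief. Both major parts of pypy contain plenty of interesting material, which I will write about once I have played with it more.

[update] I just saw the release announcement for pypy 0.99, which says of the translated interpreter's performance: twice the speed of the 0.9 release, overall 2-3 slower than CPython. And also: It is now possible to translate the PyPy interpreter to run on the .NET platform. the JavaScript backend has evolved to a point where it can be used to write AJAX web applications with it. WOW!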

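Friday, March 9, 2007

pythonic cherrypy

I just came across this page: http://tools.cherrypy.org/wiki/InteractiveInterpreter and discovered that cherrypy can be used this way too, which is rather fun. As the author says: We think it showcases the pythonic nature of CherryPy. The video there uses an older version of cherrypy, though, and cherrypy3 is slightly different. Here is what I got experimenting with cherrypy3 in ipython: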

Python 2.4.4 Stackless 3.1b3 060516 (#71, Jan 27 2007, 21:48:58) [MSC v.1310 32
bit (Intel)]
Type "copyright", "credits" or "license" for more information.

IPython 0.7.3 -- An enhanced Interactive Python.
?       -> Introduction to IPython's features.
%magic  -> Information about IPython's 'magic' % functions.
help    -> Python's own help system.
object? -> Details about 'object'. ?object also works, ?? prints more.

In [1]: import cherrypy

In [2]: cherrypy.config.update({
  ...: 'autoreload.on':False,
  ...: 'server.log_to_screen':False
  ...: })

In [3]: class Hello(object):
  ...:     @cherrypy.expose
  ...:     def index(self):
  ...:         return 'hello world!'
  ...:     @cherrypy.expose
  ...:     def test(self):
  ...:         yield 'test1'
  ...:         yield 'test2'
  ...:

In [4]: hello = Hello()

In [5]: cherrypy.tree.mount(hello, '/')
Out[5]: <cherrypy._cptree.Application object at 0x00E2C0F0>

In [6]: cherrypy.engine.start(blocking=False)
CherryPy Checker:
The Application mounted at '' has an empty config.


In [7]: cherrypy.server.quickstart()
[09/Mar/2007:21:01:40] HTTP Serving HTTP on http://0.0.0.0:8080/

# Note: at this point http://localhost:8080/ and http://localhost:8080/test are reachable.

In [8]: def test2(self):
  ...:     return 'test2'
  ...:

In [9]: Hello.test2 = cherrypy.expose(test2)
# Note: at this point http://localhost:8080/test2 is reachable too!

So convenient!

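Wednesday, February 7, 2007

The mighty sqlalchemy

sqlalchemy's documentation is exemplary. No wonder, since the author also develops template languages (myghty, mako), heh. In fact the sqlalchemy documentation is written with myghty.

But once a system gets complex and the features pile up, even the best documentation can leave you lost. Having used sqlalchemy a bit recently I feel this keenly, so here are a few of the more commonly used features that came to mind. It is only an outline, meant both to sort out my own thoughts and to give newcomers a glimpse of the best of sqlalchemy.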

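Eager Loading: Join is such a common operation in relational databases, yet django's orm simply does not support it, and SQLObject's approach is far from satisfying either.
Association Object: many-to-many relationships are implemented by adding an intermediate table. Once mapped to objects, that intermediate table no longer needs our attention and is handled implicitly. For pairwise many-to-many relationships among several entities, however, adding a separate association object is often more convenient. There are plenty of such examples, e.g. user-bookmark-tags, or product-component-component supplier (this one is from a final exam question ;-)
Deferred Column Loading: the body column of an article table, for example, is usually large, so there is no need to fetch it when listing articles. If some column stores files, this feature matters even more. It is an unremarkable little feature, but after seeing a thread on javaeye about how painful it is to implement even in the famous Hibernate, I suddenly realized that sa really is impressive. Heh, thanks to being a dynamic language, I suppose.
Mapping a Class with Table Inheritance: how to map inheritance between objects onto a relational database. sqlalchemy offers three ways: single table inheritance, where all subtypes live in one table; concrete table inheritance, where each subtype lives in its own table; multiple table inheritance, where parent and child types each have their own table and are joined at query time. The last one obviously has the least redundancy, but it costs a join at query time, so the choice depends on the situation.
Mapping a Class against Arbitrary Selects: map an object to an arbitrary select, that is, to any sql subquery. This is extremely powerful. With it we can proudly claim that there is (almost) nothing sqlalchemy cannot do!
Identity Map: the session is a very important concept in sqlalchemy. It tracks modifications to objects, tracks the associations between objects, works out the order in which database operations should run, and so on. The Identity Map is a part of the session that is easy to trip over. You can think of it as a cache keyed by the table's primary key. After each query, if sqlalchemy finds that the Identity Map already holds an instance with the same primary key, it does not create a new instance, because having several instances would cause many problems, for example confusion when they are modified and saved separately. Occasionally the Identity Map produces surprising behaviour, such as ticket 458, but once you understand how it works this is no longer a problem. It is worth mentioning that the Mapper Options include an always_refresh parameter. If it is set to True, every query on that mapper overwrites the instances already in the Identity Map with the data read from the database, so any unsaved changes made to the old instances are lost. Use it with care!
Cascade rules: the last one is also very useful. As an example, user and article have a one-to-many relationship. When a user is deleted, should the related articles be deleted too, and what if the articles have other dependencies? Such decisions depend on the actual requirements, and the way to control this behaviour is the cascade parameter of relation. For the possible values and their meaning, go read the documentation.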
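The Identity Map item in the list above is the easiest one to picture in code. Below is a toy sketch in Python 3, not SQLAlchemy's implementation; `Session`, `fake_fetch` and `User` are invented names, and the "database" is a function that fabricates rows.

```python
class User:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class Session:
    """Toy session that keeps one instance per (class, primary key)."""

    def __init__(self, fetch_row):
        self._fetch_row = fetch_row
        self._identity_map = {}

    def get(self, cls, pk):
        key = (cls, pk)
        if key not in self._identity_map:
            # Only hit the "database" when the identity is not known yet.
            row = self._fetch_row(cls, pk)
            self._identity_map[key] = cls(**row)
        return self._identity_map[key]


calls = []


def fake_fetch(cls, pk):
    calls.append(pk)
    return {"id": pk, "name": "user%d" % pk}


session = Session(fake_fetch)
first = session.get(User, 1)
first.name = "edited"          # unsaved, in-memory change
second = session.get(User, 1)  # same identity, same object
print(first is second, second.name, calls)
```

Because both lookups return the same object, the unsaved edit is still visible the second time. An always_refresh style option would amount to overwriting that object's attributes with freshly loaded data, which is why it can silently discard such edits.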

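All in all, this post is only an outline. For the details you still have to read the documentation, the examples and the unit tests.

One last thing I want to say: the main reason people choose an ORM is to escape SQL, yet I feel you cannot master sqlalchemy (well) without mastering SQL. At the very least you need to understand the concepts of relational databases, and understanding SQL is understanding relational databases. Only then can you use sqlalchemy to get the most out of a relational database!

The benefit of sqlalchemy is that you no longer write sql by hand and the differences in SQL syntax between dbms are hidden, while you can still take advantage of the unique features of a particular DBMS when you need them, and you manage your database access code as objects, which improves code reuse!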

Introducing Duck Typing

Written with google docs: http://docs.google.com/View?docid=dczg8vtk_18gxgvgq&revision=_published

Tuesday, February 6, 2007

[豆瓣九点] Blog claim post

doubanclaim8bca4134ae01e52b

The page for 白菜 is here: http://9.douban.com/subject/9031109/ To be honest I do not really know how to use 豆瓣九点 yet; I will look into it first.

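Monday, February 5, 2007

Deploying Django

Django Book Chapter 21: Deploying Django

I am sure many people are interested in this chapter ;-) It starts with django's “Shared nothing” design philosophy, which is the source of django's scalability. Then it describes the typical setup they prefer:

Linux as the operating system, Ubuntu in particular.
Apache and mod_python as the web server.
PostgreSQL as the database server.

Next it covers how to configure apache, mod_python and your django application. It shows how to deploy several django applications on one apache, how to use mod_python as a development server, how to serve static files, how to handle errors and so on.

After that it describes deploying django applications with fastcgi, but I am not very interested in that part and skipped straight past it.

Then comes the long-awaited Scaling! A picture is worth a thousand words:

Finally there is another important part, tuning, though in the end it comes down to the same few points:

Buy more memory.
Turn off Keep-Alive, although this only applies in most cases; it depends on what your site actually does.
Use memcached.
Take an active part in the communities of the open source products you use.

ps: I have not blogged for some days. Exams just ended, my mind ran wild and I forgot all my plans, so here is a post to make up the numbers ;-)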

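Saturday, January 27, 2007

Emulating ruby's open class in python

I wrote this code long ago but was too lazy to blog about it; I think I will never understand why anyone would invent such a strange thing. In the end I decided to write it up anyway; a bit more fun code might draw more people to python, heh. I have plenty of unflattering words for this ruby feature, but since people use it, there must be a reason. Here is the code: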

def update( klass, bases, attrs ):
    for k,v in attrs.items():
        if not k.startswith('__') or not k.endswith('__'):
            setattr(klass, k, v)
    if bases:
        klass.__bases__ = bases
    return klass

class Meta(type):
    def __new__(cls, klass, bases, attrs):
        try:
            return update( globals()[klass], bases, attrs )
        except KeyError:
            return type.__new__(cls, klass, bases, attrs)

# test
__metaclass__ = Meta

# test simple
class A:
    def say(self):
        print 'hi'

a = A()
a.say() # hi

class A:
    def say(self):
        print 'ho'
    def new_func(self):
        print 'new'

a.say() # ho
a.new_func() # new

# test inherit
#del A
#class A:
    #def say(self):
        #print 'hi'

#a = A()
#a.say() # hi

#class B:
    #def say(self):
        #print 'ho'

#class A(B):
    #def say(self):
        #super(A, self).say()

#a.say() # ho

update: unfortunately, testing shows that new style classes seem to have a bug here, so I have commented out the latter part for now. I do not know how python2.5 fares.
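For anyone trying this on Python 3, where a module-level `__metaclass__` no longer has any effect, here is a rough equivalent of the simple case. It is a sketch under my own assumptions: `OpenClassMeta` and its `registry` are invented names, classes are matched by name only, and methods that use the zero-argument super() are not handled.

```python
class OpenClassMeta(type):
    """Re-declaring a class with the same name updates the existing class."""

    registry = {}

    def __new__(mcls, name, bases, attrs):
        existing = mcls.registry.get(name)
        if existing is None:
            cls = super().__new__(mcls, name, bases, attrs)
            mcls.registry[name] = cls
            return cls
        # Copy the new attributes over, skipping dunder names such as
        # __module__ and __qualname__ that the class body always defines.
        for key, value in attrs.items():
            if not (key.startswith("__") and key.endswith("__")):
                setattr(existing, key, value)
        return existing


class A(metaclass=OpenClassMeta):
    def say(self):
        return "hi"


a = A()
before = a.say()


class A(metaclass=OpenClassMeta):
    def say(self):
        return "ho"

    def new_func(self):
        return "new"


print(before, a.say(), a.new_func())
```

The instance created before the second class statement picks up the new methods, because both statements end up bound to one and the same class object.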

Friday, January 26, 2007

Added a few js effects with jquery

The code is super simple:

$(function(){
  $(".box h2").css('cursor', 'pointer');
  $(".box h2").click(
    function(e){
      $(this).next().toggle();
    }
  ).click();
  $(".post-title").css('cursor', 'pointer');
  $(".post-title").click(
    function(e){
      $(this).next().next().toggle();
    }
  ).click();
  $('#readerpublishermodule0 h3').css('cursor', 'pointer');
  $('#readerpublishermodule0 h3').click(
    function(e){
      $(this).next().toggle();
    }
  ).click();
})

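Wednesday, January 24, 2007

ie sucks

By chance I looked at today's work in ie and found that the widgets on the right have no rounded corners. Depressing. All I can do is suggest that everyone use firefox ;-) Also, because an overly long word stretches the div in ie and I do not want to use a fixed width, I had no choice but to resort to word-break: break-all; which splits a word wherever a line break is needed. It is really unpleasant to read, but there is no way around it; just view it in firefox. If you know a good fix for ie, please tell me. Thanks in advance ;-)

added google reader clip

I have used google reader for so long and only now found that it has this feature. See Starred Google Reader on the right: those are all the items I have starred. The css of the google reader clip also looks nice, so I quickly copied it, and now everything on the right is in that style. Sweet ;-) It may look a little out of tune with the rest, but making everything green does not feel right either. Either way it is much better than before.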

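Monday, January 22, 2007

integrate genshi with django

I wrote a program for using genshi templates in django: http://huangyilib.googlecode.com/svn/trunk/mashi_django/genshi_django.py

The GENSHI_TEMPLATE_DIRS tuple in the settings file specifies where templates are stored;
template files are also looked up automatically in the genshi_templates directory of each installed app;
when DEBUG is True the template auto_reload is turned on, otherwise it is off;

genshi is a fine template engine and I hope you will like it. With genshi_django added, mashi_django now truly lives up to its name. If you find any problems, remember to tell me ;-)

integrate mako with django

I wrote a program for using mako templates in django: http://huangyilib.googlecode.com/svn/trunk/mashi_django/mako_django.py

The MAKO_TEMPLATE_DIRS tuple in the settings file specifies where templates are stored;
templates are also looked up automatically in the mako_templates directory of every installed app;
by default the python code compiled from a template is placed in the same directory as the template file, named after the template file with '.py' appended. You can set the MAKO_MODULENAME_CALLABLE callable to define your own way of generating module file names. This feature comes from mako ticket 14, which I think was my first ticket ;-)
if MAKO_MODULE_DIR is specified in the settings file, all compiled python code is stored in that one directory.

mako is a fine template engine and I hope you will like it.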

[news] django moving towards 1.0

All good news ;-)

There’s a lot of different things that “1.0” can mean. In many cases the label refers to some arbitrary measure of code maturity, but that’s usually very indistinct. There’s quite a bit of “1.0” software that’s far less robust than Django was at day 1; we could have called it “1.0” then and gotten away with it, I think.

In the context of Django, though, 1.0 has always meant something more concrete: forwards compatibility. Once we tag something as 1.0, we’re committing to maintaining API stability as described in the contributing HOWTO (http://www.djangoproject.com/documentation/contributing/#official-releases).

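Quoted from the mailing list

This page lists features likely to appear in 1.0, still under discussion. Recently django also set up a new ticket management process and put together a team of 4 ticket managers: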

The last, most important, piece of the puzzle, is that we now have official ticket managers, a group of volunteers who work together to manage ticket metadata and otherwise streamline the process. Although anyone can -- and is encouraged to -- help out keeping tickets organized, these folks have volunteered to take ownership of the ticket tracker in the long term. Please welcome Chris Beaven (SmileyChris), Simon Greenhill, Michael Radziej and Gary Wilson!

There is now also a dedicated release manager, and django 0.95.1 was released recently.

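Friday, January 19, 2007

do it runtime

Anyone moving from a static language to a dynamic language for the first time needs a fairly big mental leap, mainly because many things the compiler does in a static language either no longer exist in a dynamic language or have to be done at run time. Typical examples include type checking, overloading, access control and constants. (Those are the ones I can think of for now; code generation techniques such as define and template are left out.)

1. Type checking.

I think most people prefer type checking to be optional. After all, a dynamic language is not a static language, and duck typing gives dynamic languages enormous flexibility. I found only one implementation of type checking for python: http://oakwinter.com/code/typecheck/ . From a quick look at the documentation it already seems quite complete.

For learning purposes I also wrote a super simple one myself: http://huangyilib.googlecode.com/svn/trunk/typecheck.py . The code makes decent study material, and writing it gave me a fuller understanding of how python handles function arguments.

Here is the test output first, which gives a glimpse of what it can do: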

call temp(1, 'hello')
call temp(1, 'hello', c=4)
call temp(1, c=4, b='hello')
call temp(a=1, c=4, b='hello')

call temp(1, 2)
TypecheckError : the value 2 of argument 'b' is not type <type 'str'>

call temp(1, 'hello', c='hello')
TypecheckError : the value 'hello' of argument 'c' is not type <type 'int'>

call temp(1, c=1)
TypeError : temp() takes at least 2 non-keyword arguments (1 given)

temp() has not this keyword argument 'd'

the default value 1 of argument 'c' is not type <type 'str'>

temp() has not so meny arguments 4

test success
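A run-time check of this kind fits in a small decorator. The following is a sketch in Python 3 and is not the typecheck.py linked above; the decorator signature and the `TypecheckError` class are my own guesses at a reasonable shape, built on `inspect.signature` from the standard library.

```python
import functools
import inspect


class TypecheckError(TypeError):
    pass


def typecheck(**expected):
    """Check named arguments against the given types on every call."""
    def decorate(func):
        signature = inspect.signature(func)
        for name in expected:
            if name not in signature.parameters:
                raise TypecheckError(
                    "%s() has no argument %r" % (func.__name__, name))

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # bind() applies Python's own argument rules, so positional,
            # keyword and default arguments are all mapped to their names.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            for name, value in bound.arguments.items():
                wanted = expected.get(name)
                if wanted is not None and not isinstance(value, wanted):
                    raise TypecheckError(
                        "the value %r of argument %r is not type %r"
                        % (value, name, wanted))
            return func(*args, **kwargs)
        return wrapper
    return decorate


@typecheck(a=int, b=str, c=int)
def temp(a, b, c=1):
    return (a, b, c)


print(temp(1, "hello"))
print(temp(a=1, c=4, b="hello"))
```

Calling temp(1, 2) raises TypecheckError, while temp(1, c=1) still fails with Python's ordinary TypeError for the missing argument, raised by bind().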

Also worth mentioning is pep-3107 in python3000, which proposes a way to attach metadata to functions:

def foo(a: 'x', b: 5 + 6, c: list) -> max(2, 9):

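This is not designed for type checking, though. Type checking is just one potential use. The feature itself only stores metadata; what the metadata is and how it is used is up to third-party libraries. Other potential uses include documentation generation, rpc, interaction with static languages and so on.

2. Overloading.

The first thing to say about overloading is that python's flexible argument passing removes a great many of the situations where overloading would be used, but python is still helpless for the remaining cases that overload on the type of the actual arguments. Fortunately we have PEAK, which includes RuleDispatch for exactly this. Guido's blog post Python 3000 - Adaptation or Generic Functions? talks about adding it to python3k and set off a lively discussion. Here it is not called overloading but Generic Function, yet the substance is the same: call the appropriate function depending on the types of the arguments passed in, for example: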

>>> class PrettyPrinter:
...     @generic
...     def pformat(self, object):
...         """Return the pretty string representation of object"""
...         return repr(object)
...
>>> @PrettyPrinter.pformat.when(object=list)
... def pformat_list(self, object):
...     s = '['
...     for item in object:
...         s += (' '*self.indent) + self.pformat(item) + ',\n'
...     return s + (' '*self.indent) + ']'
...

Then, when calling

PrettyPrinter().pformat([1,2,3])

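the function actually invoked is the second one, pformat_list.

3. Access control.

python does not enforce access control on attributes and relies on convention instead. Partly this is the creed of trusting the programmer, and partly, I think, it really is hard to implement: attribute access is so common in a program that checking it at run time would cost too much efficiency to be worth it. ruby does enforce access control. All attributes of an object can only be exposed through methods, and access control is then applied to the methods. In other words, every time you access an attribute an object exposes you are actually calling a method, and before the call the access control mechanism must first decide whether the call site is allowed to call that method! So ruby is slow not only because its implementation is slow; its language design is slow in itself! (If I have misunderstood ruby, corrections are welcome.)

Another reason is that in ruby functions are not first-class objects: you can call a function but you cannot get hold of the function object itself. In python functions are first-class objects. A function object can be passed as an argument, and the methods in a class are really just ordinary functions; you can hand a function object defined outside to a class to use as a method. This brings enormous flexibility, but it also makes access control on methods simply impossible in this situation! Think about it: a function defined outside a class naturally cannot access the class's private attributes, so should it suddenly be able to once it becomes a method of the class?

4. Constants.

A constant is, in other words, a read-only variable. In static languages it too is enforced by the compiler constraining the code at compile time. So how should it be done in a dynamic language? The question came up recently in two mailing list threads: "How do I implement something like define in python?" and "Why not use property to implement read-only attributes?". The most common approach is to use property to implement a read-only attribute: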

>>> class Person(object):
...     def __init__(self, name):
...         self.__name = name
...     @property  # read-only attribute
...     def name(self):
...         return self.__name
...
>>> p = Person('huangyi')
>>> p.name
'huangyi'
>>> p.name = 'another'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: can't set attribute
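As a small sketch of how far this protection goes (assuming only standard name mangling, under which `self.__name` inside `Person` is stored as `_Person__name`), the property blocks ordinary assignment, but the underlying value can still be rebound by anyone who uses the mangled name. In that sense the read-only attribute is still a convention rather than enforced access control.

```python
class Person(object):
    def __init__(self, name):
        self.__name = name  # actually stored as _Person__name

    @property  # read-only: no setter is defined
    def name(self):
        return self.__name


p = Person('huangyi')

# Ordinary assignment through the property is rejected.
try:
    p.name = 'another'
    rebound = True
except AttributeError:
    rebound = False

# The mangled name bypasses the property entirely.
p._Person__name = 'another'
```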

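I remember summarizing the various approaches in great detail in that mailing-list discussion, but when I went searching just now it had vanished! Was I hallucinating? Gmail probably had a hiccup at the time. Luckily I had saved a copy elsewhere ;-) and while I was at it I converted it to doctest form. Below is a pasted copy of the syntax-highlighted version; you can also find the code and the syntax-highlighted HTML here: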

# -*- coding: utf-8 -*-
'''
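It seems to me that this post and the earlier thread about implementing
define in Python are both after the same thing: const from static
languages, something that cannot be modified after it is first initialized.

Python objects do in fact have something like this: immutable objects.
But constness applies to names rather than objects, so the precise
definition of a constant in Python would be: a name that cannot be rebound
to another object after its first binding.

Unfortunately, Python has no such thing.

As with type checking and access control, static languages check constants
in the compiler at compile time, whereas Python, even if it implemented
them, could only do the check at run time, which inevitably costs
performance. I suspect that is why Python has no such thing.

But just as access control in Python is done through naming conventions,
constants are well suited to the same treatment.

If you really must emulate const in a dynamic language, the key is to
control the binding of names.

The approaches are summarized below: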
'''

def a_const_value():
   '''
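    Approach 1 replaces direct access to a name with a function call,
    which looks a bit silly. In Ruby, though, where the parentheses of a
    call can be omitted, it comes fairly close.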

    >>> a_const_value()
    'const'
    '''
   return 'const'

class Temp(object):
   '''
    Inside a class, property makes this neater:

    >>> t = Temp()
    >>> t.a_const_value
    'const'
    >>> t.a_const_value = 'another value'
    Traceback (most recent call last):
        ...
    AttributeError: can't set attribute
    '''
   @property
   def a_const_value(self):
       return 'const'

class ConstError(Exception):
   pass

class Consts(object):
   '''
    Approach 2 collects the constant names in a class and manages them in one place:

    >>> consts = Consts()
    >>> consts.a = 2
    >>> consts.a
    2
    >>> consts.a = 3
    Traceback (most recent call last):
        ...
    ConstError: can't rebind const name

    Note, however, that a constant can still be changed directly through __dict__:
    >>> consts.__dict__['a'] = 3
    >>> consts.a
    3
    '''
   def __setattr__(self, name, value):
       if name in self.__dict__:
           raise ConstError("can't rebind const name")
       else:
           self.__dict__[name] = value

class ConstBase(object):
   '''
    Or let the class itself declare which names are constants:

    >>> class Temp(ConstBase):
    ...     __consts__ = {'a':None, 'b':2}
    ...     def __init__(self, a):
    ...         self.a = a
    ...
    >>> t = Temp(2)
    >>> t.a
    2
    >>> t.b
    2
    >>> t.a = 3
    Traceback (most recent call last):
        ...
    ConstError: can't rebind const name
    >>> t.b = 3
    Traceback (most recent call last):
        ...
    ConstError: can't rebind const name
    >>> t.c = 5
    >>> t.c
    5

    With this approach, too, a constant can be modified directly through __dict__:
    >>> t.__dict__['a']= 3
    >>> t.a
    3
    '''
   __consts__ = {}
   def __setattr__(self, name, value):
       if name in self.__consts__:
           if self.__consts__[name] is None:
               # first binding: record the value
               self.__consts__[name] = value
           else:
               raise ConstError("can't rebind const name")
       else:
           super(ConstBase, self).__setattr__(name, value)
   def __getattr__(self, name):
       # only called when normal attribute lookup fails
       if name in self.__consts__:
           return self.__consts__[name]
       else:
           # object defines no __getattr__, so report the miss directly
           raise AttributeError(name)

if __name__ == '__main__':
   import doctest
   doctest.testmod()
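One caveat about ConstBase above: `__consts__` is a class attribute, so the values bound through it are shared by every instance of the subclass, and creating a second instance with a different value raises ConstError. The following is only a sketch of one possible per-instance variant, not part of the original summary. The names `PerInstanceConsts` and `Point` are hypothetical, and it assumes constants are declared as a tuple of names and stored in each instance's own `__dict__`.

```python
class ConstError(Exception):
    pass


class PerInstanceConsts(object):
    """Hypothetical variant: names listed in __consts__ bind once per instance."""
    __consts__ = ()

    def __setattr__(self, name, value):
        # A const name may be bound only while it is absent from this instance.
        if name in self.__consts__ and name in self.__dict__:
            raise ConstError("can't rebind const name")
        object.__setattr__(self, name, value)


class Point(PerInstanceConsts):
    __consts__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p1 = Point(1, 2)
p2 = Point(3, 4)  # does not collide with p1's constants
```

As with the other approaches, writing to `__dict__` directly still gets around the check.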

Wednesday, January 17, 2007

new blog template

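Ha, new year, new template! I grabbed an image-free template from oswd and tweaked it a little. Open source stuff really is great! I'm more and more impressed by how professional Google's blog service is: it even has a dedicated template language for its templates. Awesome!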

Tuesday, January 16, 2007

Build extensible application with egg

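In the Python community, egg is by now a widely known format. As everyone knows, for a python an egg is the thing baby snakes hatch from, and the baby snakes are naturally Python packages (pure nonsense ;-). Put simply, egg is to Python what jar is to Java: a packaging format for software. Note that "format" here does not mean a file format. An egg can actually use several file formats, the most common being zip; the format in question is mainly the layout of the files it contains.

If all an egg did was pack a package into a zip, it would be nothing special. Clearly that is not all it is for: the most important job of an egg is to add metadata to a package! What the metadata contains and how it is formatted is hardly mandated at all, which leaves a lot of room for applications built on eggs, such as the PyPI package registration and lookup mechanism, or handling dependencies between packages. setuptools, for instance, defines some metadata formats; a package developer only has to write the relevant information into setup.py in that format, and setuptools reads it and takes care of these things for you.

Beyond that, I think the most interesting use of eggs is as a plugin system. You probably know several frameworks that use eggs this way: TurboGears' template plugin system, Paste's extensible project templates, Trac's plugin system, and others. A plugin system comes down to a few steps: the framework defines one or more plugin interfaces; third-party plugins implement those interfaces; the plugins are registered; the framework discovers and uses the plugins. setuptools and eggs handle the last two steps, which are also the most troublesome: registration and discovery.

Let's look at the result first. The following code finds all the Paste project templates you have installed (the pkg_resources module used below comes with setuptools):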


In [3]: import pkg_resources

In [4]: pts = list( pkg_resources.iter_entry_points('paste.paster_create_template') )

In [5]: pts
Out[5]:
[EntryPoint.parse('pylons_minimal = pylons.util:MinimalPylonsTemplate'),
EntryPoint.parse('pylons = pylons.util:PylonsTemplate'),
EntryPoint.parse('myghty_modulecomponents = myghty.paste.templates:MCTemplate'),
EntryPoint.parse('myghty_simple = myghty.paste.templates:SimpleTemplate'),
EntryPoint.parse('myghty_routes = myghty.paste.templates:RoutesTemplate'),
EntryPoint.parse('basic_package = paste.script.templates:BasicPackage'),
EntryPoint.parse('paste_deploy = paste.deploy.paster_templates:PasteDeploy'),
EntryPoint.parse('toscawidgets = toscawidgets.util:ToscaWidgetsTemplate')]

In [6]: pts[0].name
Out[6]: 'pylons_minimal'

In [7]: pts[0].module_name
Out[7]: 'pylons.util'

In [8]: pts[0].attrs
Out[8]: ('MinimalPylonsTemplate',)

In [9]: pts[0].load()
Out[9]: <class 'pylons.util.MinimalPylonsTemplate'>

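As you can see, the last step loads, at run time, the class that implements the interface Paste defined. Registration is even simpler: fill in the entry_points metadata in setup.py, run setup.py install, and you're done. The metadata used in this process is called entry_points. Its format is simple, in fact just ini format, with one group holding multiple names: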


[group_name]
...
...

pkg_resources defines the format of each name entry as:

name = some.module:some.attr [extra1,extra2]

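The group_name is defined by the framework and in effect corresponds to an interface. A plugin just lists the classes that implement that interface under it, runs setup.py install, and that's it. Now that setuptools and eggs have done the tedious work for you, all that is left is to design your system and define your plugin interfaces.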
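To make the registration step concrete, here is a minimal sketch. The project name `myplugin`, the group `myframework.plugins`, and the entry name `dumper` are all hypothetical, and `json:dumps` merely stands in for a real plugin class. The setup.py fragment is shown as a comment; the runnable part uses the standard library's `importlib.metadata.EntryPoint` (a newer counterpart of the pkg_resources API used above) and builds the entry point by hand, so it works without installing anything.

```python
from importlib.metadata import EntryPoint

# What a plugin author might put in setup.py (hypothetical names):
#
#     from setuptools import setup
#
#     setup(
#         name='myplugin',
#         entry_points={
#             'myframework.plugins': [
#                 'dumper = json:dumps',
#             ],
#         },
#     )

# Framework side: an entry point is just "name = module:attr" within a group.
# Here one is constructed directly instead of being discovered from an
# installed package.
ep = EntryPoint(name='dumper', value='json:dumps', group='myframework.plugins')

plugin = ep.load()          # imports the module and fetches the attribute
result = plugin({'a': 1})   # use the plugin through the agreed interface
```

In a real framework the entry points would come from installed packages, the way `iter_entry_points` returned them in the session above, rather than being built by hand.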

Saturday, January 13, 2007

Recently...

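When there's nothing important to write about, I use this title: Recently... ;-) I'm starting to make plans for next semester. It looks like it will be my last semester at school (and the most idle one ;-). As soon as I graduate I have to report to the company, without even a summer break ;-) so I might as well treat next semester as my last "summer vacation". Even a "summer vacation" can't be spent idle, though; there's a whole pile of things I want to do: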

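Yesterday ZoomQuiet brought up Leo on the mailing list again, which got me thinking about literate programming once more, and I find it increasingly reasonable. The core of Leo is the outline, which can express the logical relationships between anything at all. This gives Leo incredible flexibility: you can use it to organize and manage all sorts of things. At the same time it ties those logical relationships to physical files (code, documents, and so on) and keeps the two in sync, which means you can use it as an IDE! Looking at the Leo website today, I was pleased to find a lot more documentation (last time the lack of docs made me leave after a brief encounter). The online tutorial is excellent, and there is now a big pile of plugins. I plan to find time to study it properly.
I've been using vim for a fair while now, yet I often feel I can't use it fluently. The main reason, I think, is that I have never read the vim documentation systematically, and the biggest gap is writing vim plugins. Once I've sorted that out I should reach the point of using it with ease.
I hang around the Python community, yet I don't know how to use linux and spend all day on a pirated xp. I'm ashamed just thinking about it = =" . Next semester I'll try working under linux and get started with it.
I'll be joining the company right after graduation, so I had better work on c/c++ first. I'll still promote python, of course, but it is a company after all, and I certainly can't just follow my own preferences.
The Pylifes project has been untouched for half a year. For next semester's graduation project I'll try to find it a suitable topic and keep working on it.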

Tuesday, January 9, 2007

A small program for downloading videos from tudou

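http://huangyilib.googlecode.com/svn/trunk/tudou_dl.py Just give it the address of a video playback page, for example: http://www.tudou.com/programs/view/AmYV7YnHqBU/ and it finds the actual flv download address for you: http://hot.tudou.com/flv/003/900/922/3900922.flv#81100#1 I only found this method after painstakingly decompiling the code of its flash player, so I hope tudou doesn't upgrade too soon ;-)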

Wednesday, January 3, 2007

Recently...

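Recently: I've been translating Appendix A of Text Processing in Python, "A Selective and Impressionistic Short Review of Python". This is a fairly old book by now. Unfortunately, I used to glance at its short table of contents, feel that I already knew it all, and never give it a proper look. Recently I happened to read a few pages carefully and was delighted to find that it is a rare gem. Take this appendix: it is exactly the ultra-compact but in no way shallow introduction to Python that I had been looking for. Not to mention the large number of high-quality (pythonic) Python programs in the book. I think this appendix is ideal for people who already have rich programming experience in other languages, even rich experience with dynamic languages, to quickly grasp the essence of Python!

Recently: a new template engine, mako (http://www.makotemplates.org/), has burst onto the scene. A myghty killer! The most eye-catching part is probably its benchmark:

Mako: 0.90 ms
Myghty: 5.25 ms
Cheetah: 0.70 ms
Genshi: 12.53 ms
Django: 5.43 ms
Kid: 19.12 ms

Cheetah above gets a speed boost from native C extensions, whereas Mako is pure Python.

Scary! genshi has only just begun to displace kid, and now it looks like myghty is about to be replaced by mako, heh. genshi/kid are about convenient, flexible XML generation. mako/myghty generate templates of any form; their distinguishing feature is that they embed Python elegantly in the template and compile templates to Python code, which gives very high performance. In both pairs the former is a big step beyond the latter! Are genshi and mako going to split the (non-django) template world between them? We'll see. Also, since mako abstracts the template lookup logic into a flexible TemplateLookup (which I suspect was learned from django templates), I think using mako templates in django's app-based architecture shouldn't be hard; I'll try it when I get a chance. One more thing: mako templates have also absorbed the filter concept from django templates.

Recently: cherrypy 3.0 was released (http://cherrypy.org/wiki/WhatsNewIn30), with massive refactoring! The change that interests me first is: "CherryPy 3 is much faster than CherryPy 2 (as much as three times faster in benchmarks)." The second is the complete separation of the web server from the application logic server; I doubt anyone would object to calling cherrypy's web server the best WSGI server at present ;-) Also: "cherrypy.Application objects are now WSGI applications", which means cherrypy's url dispatcher now deals directly with WSGI applications. The benefit is self-evident, heh. In addition, I just came across the article "cherrypy 3 has fastest WSGI server yet." and aspen, a web server built on the cherrypy wsgi server whose goal is to let web applications of all styles be deployed in a uniform, convenient, pythonic way.

Only recently did I discover that routes already supports REST-style dispatch. It keeps up with rails quite closely, heh. I also found quite a few new features:

# Sub-domain support built-in
# Conditional matching based on domain, cookies, HTTP method (RESTful), and more
# Easily extensible utilizing custom condition functions and route generation functions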
