
REST stuff in GAE

by airportyh posted at 11-02-2008 12:45PM - Comments (0)   gae programming python
I've been working on getting justtodolist.com to use REST-style URLs. I increasingly post my technical stuff on stackoverflow.com, though, and that was the case this time, so I'll just post a link to my stackoverflow question.

Python optional arguments gotcha

by airportyh posted at 10-15-2008 02:54PM - Comments (0)   programming python
When you use Python optional arguments, you should know that the default value you set is static: it is evaluated once, at the time the function is defined, and does not change after that. If you use a mutable type as the default value, you need to be careful.
Ex:

>>> def enlist(n, lst=[]):
...   lst.append(n)
...   return lst
...
>>> enlist(3)
[3]
>>> enlist(4)
[3, 4]
>>> enlist(5)
[3, 4, 5]
>>> enlist(6)
[3, 4, 5, 6]


Every time you call the enlist function, the list gets bigger, because lst is never reset to the empty list. It is set to the empty list only once, when enlist was defined. After that it references the same instance every time you don't supply a lst argument.
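
A common way around this is to default to None and build the list inside the function. Here is a minimal sketch of that idea; the name enlist_fresh is made up for this illustration and is not part of the session above.

```python
def enlist_fresh(n, lst=None):
    # None is immutable, so it is safe as a default value.
    # A new list is created on every call that omits lst.
    if lst is None:
        lst = []
    lst.append(n)
    return lst

first = enlist_fresh(3)               # a fresh list for this call
second = enlist_fresh(4)              # another fresh list, not shared with first
extended = enlist_fresh(5, [1, 2])    # an explicit list is still appended to
```

With this version, calls that leave out lst no longer see each other's items.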

Selenium-rc proxy server war story

by airportyh posted at 10-01-2008 11:04PM - Comments (0)   java maven programming python selenium
I am trying out selenium yet again to help our non-existent QA team. I have to admit that I kinda flaked out on unit testing too (I kinda got sick of TDD in rails bogging me down, to be honest), which I mean to catch up on. But for now I wanted to focus on selenium, since with an ajax app like ours too many things could go wrong: it could be client side, server side, or a combination, so I think integration testing can really buy us a lot. Besides, I really wanted to know if selenium works well or not.

Anyway, this is not exactly a post about selenium; I'll probably dedicate another one to it later, when I have had sufficient experience with it. The problem I faced in this episode is that when I run the app through the selenium-rc proxy server, my flash messages aren't showing up! Yes, the proxy server is muffling my flash messages. What could it be? My first thought was that it had to be a cookie issue, since the flash message is implemented as a one-time use cookie, but the other cookies worked fine, or I wouldn't have been able to log in to the app.

From looking at the cherrypy code, it looks like it's setting 2 cookies, one for auth, the other for the flash message, and it generates 2 Set-Cookie headers. I suspected that this was screwing up the selenium proxy server. I decided to dig into the selenium server code and put in some debug statements.

Aww! Getting the selenium-rc source and then building it with maven 2 takes me back to the painfully slow Java days. Man! Maven, do you really have to download the entire internet just to build the project? Maven just symbolizes all that I hate about Java. It's bloated; it's framework heavy; it forces things on you that you don't need; running a build with it is glacial; building plugins is a big hassle. Did I leave out anything? I think Maven may be the worst thing that has happened to the Java community...but I digress.

Basically I tracked it down to the part where the proxy server gets the header fields from the HttpURLConnection object (part of the standard Java API), but it drops the flash message cookie somehow. I suspected it was because that API just doesn't cope with duplicate header names in the response, but it seems strange that this has never come up. Googling found that people have been able to get duplicately named headers using the getHeaderFieldKey(int n) and getHeaderField(int n) methods, which take a positional parameter. This is what the proxy server was doing, and yet it didn't work.

I used wireshark to look at the packets to confirm my theory, and it did, but I also found another interesting thing - there is an empty line between the first Set-Cookie header and the second. It came to me that this is probably what tripped up the HttpURLConnection code, and I was right. I changed the Cookie code in the standard library to use \n instead of \r\n (which was being interpreted as an empty line) as the separator and it solved the problem.

But I don't want to just patch a standard python library like that; it's not deployable. I don't know whether I should fix this in selenium-rc instead, in which case someone suggested I use the apache httpclient in place of HttpURLConnection.

Update: It turns out you can use the same trick as I displayed here to patch Python libraries, which is what I did:
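    import Cookie

    def make_myoutput():
        # Keep a reference to the original method so the wrapper can delegate to it.
        org = Cookie.BaseCookie.output
        def myoutput(self, attrs=None, header='Set-Cookie: ', sep='\n'):
            # Same as the original, except that the separator defaults to '\n'.
            return org(self, attrs, header, sep=sep)
        return myoutput

    # Swap the library method for the wrapper when this module is imported.
    Cookie.BaseCookie.output = make_myoutput()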
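
For anyone on Python 3, where the Cookie module was renamed to http.cookies, the same trick might look roughly like the sketch below. The function name patch_cookie_separator and the cookie names are invented for this example, and I have only sketched it against the standard library, not against cherrypy.

```python
from http import cookies  # Python 3 name of the old Cookie module

def patch_cookie_separator(sep="\n"):
    # Hypothetical helper: wrap BaseCookie.output so the default separator changes.
    original = cookies.BaseCookie.output

    def output(self, attrs=None, header="Set-Cookie:", sep=sep):
        return original(self, attrs, header, sep)

    cookies.BaseCookie.output = output
    return original  # returned so the patch can be undone if needed

original_output = patch_cookie_separator()

jar = cookies.SimpleCookie()
jar["auth"] = "abc"
jar["flash"] = "saved"
rendered = jar.output()  # one Set-Cookie line per cookie, joined with "\n"
```

Returning the original method makes it easy to restore it later with cookies.BaseCookie.output = original_output.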

Web Dev with Google App Engine

by airportyh posted at 07-29-2008 10:32PM - Comments (0)   appengine google programming python
I wrote my first Google App Engine app! It's located at justtodolist.appspot.com. It's yet another todo list - I have been a tadalist user for a while and thought I could make it slightly better, and so I did. Here are some thoughts on GAE.

First up, the things I like.
  1. The number one benefit of GAE for me is definitely one step deployment. Not that you couldn't set up one step deployment for rails apps, but it just takes a lot of work to set this stuff up the first time. With GAE, it's one step deploy the first time. I would say it's easier even than php, in large part because of benefit number two.
  2. There is no database to set up. As most people know by now, you use GQL/Big Table on GAE, and it is very different from relational databases. Setting up is really minimal. You specify your model in a DSL similar to what you'd write with SQLObject or Elixir, or Data Mapper for Ruby folks. And then boom! You are running.
  3. There's no user authentication to set up either. It's basically Gmail authentication, if you are willing to go along, that is. The User API is very simple, and you can start using it right away.
  4. The development server is nice: it picks up changes immediately when you save any project file.
  5. The number of project files you have is very minimal; it's very non-cluttery.
So as you can see GAE is great for rapid prototyping. Now for things I am not that crazy about. Actually, most of the benefits I spoke of have some caveats:
  1. Although deployment is easy, issues sometimes arise from the fact that things work slightly differently in production vs development. For example, you need indices to build fully in production before the app is ready to run, and for some reason transaction rules work differently in production vs development (I haven't dug into this fully, but it might be a bug).
  2. Big Table is cool, and it's supposed to be super scalable, but there are a couple of things that are annoying about it. I can get over the fact that it's fundamentally different from relational databases: things like you can't do aggregate queries, joins and so forth. For performance too, some things you will just have to do differently than you would normally with relational databases. I am okay with that. What pains me is that there is no proper data migration path. When you change your models (add or remove fields, and so forth), the old stuff just sticks around. To migrate the old data, you basically have to manually write a script that loops over the existing entities and modifies them, but the script has to be triggered from an http request just like everything else, because that's the only way to run your code on GAE on the production server... ghetto! Also, I am extremely annoyed that while in development you can just clear your datastore, there is no analog in production. I realize though that this is a work in progress and that things will get better in the future.
  3. Well, the caveat for using Gmail authentication is that... your users must have a Gmail account, duh... I am sure you can use other authentication schemes if you want, I don't see anything preventing that
  4. Yes the development server is nice, but for some reason it was progressively running slower on my company Dell D820 laptop. This is so especially if you perform extensive modification of the data models.
Here are some other thoughts:
  1. The development console is not good, man! It's nowhere near the usability of the python interactive shell. I've heard of an alternative but haven't seen it yet.
  2. I haven't dug into how to do TDD with it yet, but I've read it's possible.
  3. Again, Big Table is VERY different from relational databases. I originally ported my todo list app from a SQLObject -> MySQL backend, with only 2 data models: TodoList and TodoItem. In Big Table we still have the 2, but they look kinda different. I had to change them for performance reasons. I'll put the discussion of that in a separate post.
  4. I haven't got a great handle on how transactions/entity groups work. I thought I did, until my transaction code didn't work; I will have to look closer into it. Documentation on this is kinda sparse. Right now my code is non-transactional.
Open Source GAE Apps?

Here's another thought. Will it be possible to popularize open source GAE apps? I think the most important reason for the popularity of open source php apps is the ease of deployment. With the ease of deployment of GAE, I think conditions might be ripe for there to emerge a movement of good open source GAE apps. Then again though, people might resist the vendor specific/non-open source nature of GAE. We will see.

Impressions of Turbogears 4 months in

by airportyh posted at 07-17-2008 10:53PM - Comments (0)   programming python ruby turbogears
I've been a user of Turbogears for 4 months now, working on a client facing app. The app has not gone to production yet, so I don't have much insight into deployment, but I have a lot of experience on the development side - I started from scratch - and here is what I have learned so far.

Freedom of Choice
Turbogears as a framework is pretty agnostic of different components such as ORM, or template engine. Although there is a default choice, I found it wasn't hard to stray away from it.

Mako
I ended up choosing Mako as the template engine because, coming from rails, I felt kid and genshi were too heavyweight for my taste, since they are XML-based and require your markup to be valid XML before they can do anything, which obviously means there's an XML parsing step they have to do. Mako is more like erb in that it's "text-based", i.e. it's perfectly fine to render non-valid XML code. But Mako turned out to be much more than another erb. With Mako you can easily write helper template functions and reuse them everywhere. You can also write inversion-of-control style template functions which take in a partial template and call it inside their body. It always puzzled me why you couldn't do that easily with erb or haml or most of the ruby template engines. With erb, you have to write a partial view as a separate file, but calling a partial with local parameters is inconvenient; you have to write something like:

render :partial => 'my_control', :locals => {:control_id => 'con', :height=> '50px'}

and since this is so inconvenient, I usually end up wrapping it with a helper method like:

def my_control(control_id, height)
    render :partial => 'my_control', :locals => {:control_id => control_id, :height=> height}
end


Of course this is partially due to the culture in rails that you normally write helpers in ruby rather than in a template language. In Mako you don't have to do this extra step, which makes me happy. Now if you want to do the inversion of control thing, in erb it's even worse! In mako you would just do this. So, in general, I am able to refactor my views a lot easier with mako and therefore I find myself doing it a lot more often.

SQLAlchemy
SQLAlchemy is a mainstream ORM in the python community. Its direction is different from that of ActiveRecord and is a lot more similar to Hibernate of Java, but it also has similarities to Ambition of Ruby and Linq of .NET. It is similar to Hibernate in that it is very fully featured, has sessional transaction management, and can cope with a large variety of schemas. It is similar to ambition and linq in that you can construct queries in your host language in a very succinct and elegant way (I know you can build queries in Hibernate's criteria API too, but it's not quite as elegant). I like SQLAlchemy a lot! Here are a couple of sqlalchemy tricks I like. First one:

    from sqlalchemy import or_

    # q holds the search string
    fields = [
        User.user_name,
        User.display_name,
        User.email_address
    ]
    results = User.query.filter(or_(*[f.ilike('%' + q + '%') for f in fields]))


The above code does a case-insensitive wildcard partial string match of the string q against any of the three fields listed in the fields list. Second example:

    page = User.query[10:20]
           
This looks like array slicing, but no, it's slicing against the query results! It's smart enough to build the query using OFFSET and LIMIT or equivalent. You can easily do pagination with this technique.
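
To make the OFFSET and LIMIT part concrete without needing SQLAlchemy installed, here is a rough sketch using the standard library's sqlite3 module instead. The users table, the page helper, and the sample rows are all made up for illustration; this only shows the kind of SQL that a slice like [10:20] corresponds to, not what SQLAlchemy emits verbatim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT)")
conn.executemany(
    "INSERT INTO users (user_name) VALUES (?)",
    [("user%d" % i,) for i in range(30)],
)

def page(start, stop):
    # A slice [start:stop] maps to LIMIT (stop - start) OFFSET start.
    rows = conn.execute(
        "SELECT user_name FROM users ORDER BY id LIMIT ? OFFSET ?",
        (stop - start, start),
    )
    return [row[0] for row in rows]

page_rows = page(10, 20)  # the second page of ten users
```

Only the ten requested rows come back from the database, which is the whole point for pagination.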

Python's Named Parameters
Another thing I like when working with turbogears is python's named parameters. Whereas rubyists use the hash as the poor man's named parameters, python has real named parameters, which are not only safer, but more elegant.
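
Here is a small sketch of the "safer" part, written for Python 3. The make_link function is invented for this example. A misspelled keyword is rejected at call time instead of being silently ignored, the way a stray hash key would be.

```python
def make_link(text, url, new_window=False):
    # Hypothetical helper that builds an anchor tag.
    target = ' target="_blank"' if new_window else ""
    return '<a href="%s"%s>%s</a>' % (url, target, text)

link = make_link("Home", "/", new_window=True)

error = None
try:
    make_link("Home", "/", new_windw=True)  # misspelled keyword on purpose
except TypeError as exc:
    error = str(exc)  # python names the unexpected keyword in the message
```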

Python's Polluted Name Space
I run into this problem once in a while, but I hit it 2 or 3 times in the last week! In python, list, dict, str, int, etc. are the names of fundamental types in the language, therefore you can't (actually you can, but don't want to) use them as variable names. More than once I've tried to use list as a variable name, which python doesn't complain about immediately but which causes a cryptic error down the road.
I've also tried to use from as a variable name, which turns out to be a keyword in the language. This causes a syntax error, which you don't understand until you realize it's a keyword. Now, of course, other languages suffer from this problem too, and ruby isn't any different, but I think ruby has better error messages for these syntax errors: it will tell you the symbol that is unexpected and what symbols it was expecting.
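
Here is a small Python 3 sketch of both cases. The function name count_chars is made up; the point is just that shadowing list fails late, at the call, while from can be checked up front with the standard keyword module.

```python
import keyword

def count_chars(text):
    list = [1, 2, 3]     # shadows the built-in list type inside this function
    return list(text)    # fails here: a list instance is not callable

shadow_error = None
try:
    count_chars("abc")
except TypeError as exc:
    shadow_error = str(exc)

# Keywords can never be variable names; built-in type names can (unfortunately).
is_from_keyword = keyword.iskeyword("from")
is_list_keyword = keyword.iskeyword("list")
```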

Little Verbosity
Turbogears is more verbose than rails, most prominently in 2 areas: import statements and method decorators. Rails files usually have at most 2 requires (the counterpart of import in python), and most of the time none at all. My TG controller files and model.py (the file with all the model objects) usually have about 10 to 15 lines of imports, and my Mako template files usually have 2 to 4. This has a lot to do with the design of the language. The python interpreter requires each .py file to act like a module, and as a module it must identify all of its dependencies; the ruby interpreter does not require this, so your controller code, for example, doesn't need to explicitly require anything before it has what it needs to do its work.
I think method decorators are cool, but they can also be overused and become cluttery. Some of my controller methods have 4 or 5 lines of decorators or more, consisting of the mandatory expose(), access restriction spec, and form parameter validators. That's a bit much. I also don't like the fact that you need an expose() decorator for every single controller method. Rails has no such thing and usually specifies such things at the top of the class, which has pros and cons vs TG's approach, but is in the end less cluttery.

parameter list chaining in python

by airportyh posted at 07-15-2008 03:12PM - Comments (0)   programming python
Python has great support for variable length parameter lists. First, you have optional parameters:

>>> def o(opt=1):
...   print opt
...
>>> o()
1
>>> o(2)
2
>>> o(opt=3)
3


then you have the one star for any number of parameters:

>>> def f(*params):
...   for p in params: # params is a tuple
...     print p
...
>>> f(1)
1
>>> f(1,2,3,4)
1
2
3
4


and you have two stars for keyword parameters:

>>> def g(**kws):
...   for item in kws.items(): # kws is a dict
...     print item
...
>>> g(one=1)
('one', 1)
>>> g(one=1, two=2, three=3)
('three', 3)
('two', 2)
('one', 1)


You can also use them together along with regular parameters and optional parameters as long as they follow the ordering: regular > optional > single star(varargs) > double star(keyword args):

>>> def h(req, opt=None, *params, **kws):
...   print 'req=', req
...   print 'opt=', opt
...   print 'params=', params
...   print 'kws=', kws
...
>>> h(1,2,3,4)
req= 1
opt= 2
params= (3, 4)
kws= {}
>>> h(1,two=2,three=3,four=4)
req= 1
opt= None
params= ()
kws= {'four': 4, 'two': 2, 'three': 3}

>>> h(1,two=2,three=3,opt=4)
req= 1
opt= 4
params= ()
kws= {'two': 2, 'three': 3}


If you have a handle on a list or a dict object, you can use it/them directly as the parameter list when you call a function:

>>> lst = [1,2,3]
>>> f(*lst)
1
2
3
>>> dct = dict(one=1,two=2,three=3)
>>> g(**dct)
('one', 1)
('three', 3)
('two', 2)


A technique I developed is to chain function calls lazily, sort of a poor man's partial function application. Say I have a function that takes a lot of optional parameters, let's say it generates a text field of some sort:

def textfield(label, name, size=10, classname='', id=None, onclick=None, onkeypress=None):
    # blah

Now for my specific screen I want to specify a number of these parameters by default, and only vary the label and name parameters the rest of the way, so I do something like:

def tf(*params, **kws):
    # Pass everything through, and hand back whatever textfield produces.
    return textfield(size=20, classname='cool-field', *params, **kws)


now I can call it like this:
tf('Name', 'name')
tf('Birthdate', 'birthdate', onkeypress="checkBirthdate(this)")


The cool part about this is that I can add new optional parameters to textfield without having to change tf and yet still be able to use the new parameters from tf. I could just as well have used the partial function from the functional library (functools.partial in the standard library since Python 2.5), but this is pretty easy too.
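
For comparison, here is roughly what the functools.partial version might look like. The body of textfield below is a stand-in I made up so the sketch can run; it just reports the arguments it received.

```python
from functools import partial

def textfield(label, name, size=10, classname='', id=None, onclick=None, onkeypress=None):
    # Stand-in body for illustration only: echo back the arguments.
    return dict(label=label, name=name, size=size, classname=classname,
                id=id, onclick=onclick, onkeypress=onkeypress)

# Pre-fill the screen-specific defaults once.
tf = partial(textfield, size=20, classname='cool-field')

name_field = tf('Name', 'name')
wide_field = tf('Notes', 'notes', size=40)  # call-time keywords override the presets
```

One difference worth noting: with partial the caller can still override size, whereas the hand-written tf raises a TypeError if size is passed again.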

Getting the Argument Names of Your Function

by future, airportyh posted at 05-20-2008 11:50PM - Comments (0)   javascript programming python ruby
In Javascript:
myfunction.toString()
This gives you the entire definition of the function as a string, which you can parse to get the argument names, or in prototype.js you can just do:
myfunction.argumentNames()

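In Python:
myfunction.func_code.co_varnames
(This tuple lists the arguments first, followed by any other local variables; the first func_code.co_argcount entries are the positional arguments.)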

In Ruby, there is no easy way to do this, although there is A way, which involves setting a trace function on the function and then executing it to get at the local variables at execution time, see here.
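
For the Python case, here is a short Python 3 sketch (func_code is spelled __code__ there). The function myfunction and its body are made up; the sketch shows that co_varnames includes locals, and that slicing by co_argcount or using inspect.signature gives just the arguments.

```python
import inspect

def myfunction(a, b, c=3):
    total = a + b + c   # a local variable, not an argument
    return total

code = myfunction.__code__                         # Python 3 name for func_code
all_names = code.co_varnames                       # arguments first, then locals
arg_names = code.co_varnames[:code.co_argcount]    # just the positional arguments
sig_names = list(inspect.signature(myfunction).parameters)
```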




Turbogears and ReML

by airportyh posted at 03-20-2008 10:10AM - Comments (0)   haml programming python rails turbogears
I followed the Turbogears 20 minute tutorial here. My first impressions of Turbogears are:
  1. It's a bit more verbose than rails: there's more plumbing - you have to explicitly define which view you want to use for each action, as opposed to doing everything based on convention.
  2. You explicitly pass the local variables to the view as a hash, as opposed to using class or global variables.
  3. Turbogears uses a template engine called kid by default, which is very different from rails' erb in philosophy: there's more emphasis on designer friendliness and higher level support for template inheritance.
  4. Python/Turbogears in general is safer than ruby/rails (see my last post), in that you usually get more informative errors, such as NameError: global name 'pag' is not defined, rather than a nil when you didn't expect it.
  5. The development feedback is not quite as good as rails. Turbogears requires a restart every time you make a change. The restart is automatically triggered every time you save a file in the project, and it is very fast, but it still takes about 5 seconds to rails' 0 (ruby has this luxury because of its open classes).
I am hip to Haml so I had to see if I could get it working with Turbogears. I found a couple of implementations: ReML and GHRML. Tried them both, ReML was simpler and more approachable, so I wrote a Turbogears plugin for it. The important bit of the plugin code is here:

from reml import TemplateLoader

class RemlTg(object):
 
  def __init__(self, extra_vars_func=None, options=None):
    pass

  def load_template(self, templatename):
    "Find a template specified in python 'dot' notation."
    parts = templatename.split('.')
    # Everything before the last dot is the directory; the last part is the file name.
    return TemplateLoader('/'.join(parts[:-1])).load(parts[-1] + '.reml')

  def render(self, info, format="html", fragment=False, template=None):
    "Renders the template to a string using the provided info."
    return self.load_template(template).render(info)
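
To show just the dotted-name handling on its own, here is a tiny standalone sketch. The helper split_template_name is hypothetical and does not touch ReML at all; it only mirrors the string manipulation that load_template does.

```python
def split_template_name(templatename, extension=".reml"):
    # "wiki20.templates.page" -> directory "wiki20/templates", file "page.reml"
    package, _, basename = templatename.rpartition(".")
    return package.replace(".", "/"), basename + extension

directory, filename = split_template_name("wiki20.templates.page")
```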


After that, I converted the views in the tutorial into ReML. Let me do a wc on them for comparison, wait just a minute...

$ wc wiki20/templates/*.kid
  25   76 1068 wiki20/templates/edit.kid
  71  173 2802 wiki20/templates/master.kid
  21   56  773 wiki20/templates/page.kid
  21   50  705 wiki20/templates/pagelist.kid
 138  355 5348 total
airport@wedding-singer ~/documents/play/turbogears/Wiki-20
$ wc wiki20/templates/*.reml
  16   41  492 wiki20/templates/edit.reml
  38   86 1338 wiki20/templates/master.reml
  12   30  327 wiki20/templates/page.reml
   8   24  190 wiki20/templates/pagelist.reml
  74  181 2347 total
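
As a quick check of the arithmetic, the totals from the two wc runs above can be compared like this (Python 3; the dictionary names are just for this sketch):

```python
# Totals copied from the wc output: lines, words, bytes.
kid = {"lines": 138, "words": 355, "bytes": 5348}
reml = {"lines": 74, "words": 181, "bytes": 2347}

# Percentage reduction going from kid to ReML, rounded to whole percent.
reduction = {key: round(100 * (1 - reml[key] / kid[key])) for key in kid}
```

That works out to a reduction of 46% in lines, 49% in words and 56% in bytes.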


So that's about a 50% code reduction, not bad. Here's a sample for comparison:

page.kid:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://purl.org/kid/ns#"
      py:extends="'master.kid'">
<head>
<title> ${page.pagename} - 20 Minute Wiki </title>
</head>
<body>
    <div class="main_content">
        <div style="float:right; width: 10em">
            Viewing <span py:replace="page.pagename">Page Name Goes Here</span>
            <br/>
            You can return to the <a href="/">FrontPage</a>.
        </div>

        <div py:replace="XML(data)">Page text goes here.</div>
        <p><a href="${tg.url('/edit', pagename=page.pagename)}">Edit this page</a></p>

    </div>
</body>
</html>


page.reml:
- append('master.reml')
- def title():
  =page.pagename
- def content():
  %div: 'style':'float:right; width: 10em'
    Viewing
    %span=page.pagename
    You can return to the
    %a: 'href':tg.url('/')
      Frontpage
  %div=unescaped(data)
  %a: 'href':tg.url('/edit', pagename=page.pagename)
    Edit this page