Time to once again follow in the footsteps of those greater[1] than myself. A quick Google search will turn up all the other blog posts that have been written since Brian D Foy inspired Titus Brown. I made my initial list of five shortly after PyCon when I first read the post and commentary on Titus’ genesis post. Work and other projects have been taking up all my time, so I never got around to it. In the past two weeks each of these five things has come up at work, to the point that I have started working on a patch, and a PEP.
1. Decorators are lossy.
UPDATE: Peter Fein mentions his decorator module in the comments, which solves this problem. See that page for a better description of the problem and how to fix it. We need a Py3K PEP to include this functionality in the distribution and make it function-annotation aware.
More to the point, it is extremely hard to get the function or method declaration information onto the new wrapping function.
import threading

def threadsafe(func):
    locallock = threading.Lock()
    def _threadsafe(*args, **kwdargs):
        # Hold the lock for the whole call and release it even on error.
        locallock.acquire()
        try:
            return func(*args, **kwdargs)
        finally:
            locallock.release()
    # Copies only the doc string; without this line help() shows no doc
    # string at all. The name and argument spec are lost either way.
    _threadsafe.__doc__ = func.__doc__
    return _threadsafe

@threadsafe
def myfunc(fileobj, nslices, throw_on_error=True):
    """myfunc does something
    fileobj - must be an actual file object, not a file like object
    nslices - the number of columns in tabular form. If throw_on_error is True,
              then this must match exactly the columns in the file.
    A bunch of doctest code
    """
    pass
>>> help(myfunc)
Help on function _threadsafe in module __main__:

_threadsafe(*args, **kwdargs)

>>>
Doc strings are the most obvious problem as they break doctest, but they are the easiest to solve. There was discussion about this on the dev list about a year ago, and a preliminary patch to add a decorator-decorator to deal with most of this, but it was shot down primarily because doc strings were considered the most important trait to transfer and no one really cared about the rest of it.
>>> help(myfunc)
Help on function _threadsafe in module __main__:

_threadsafe(*args, **kwdargs)
    myfunc does something
    fileobj - must be an actual file object, not a file like object
    nslices - the number of columns in tabular form. If throw_on_error is True,
              then this must match exactly the columns in the file.
    A bunch of doctest code

>>>
Great... that cleared up everything. I have been known to go to extremes to get around this limitation: I have a 900-line utility module just for dealing with decorators and issues like this. Since I have workarounds, I have not raised a big stink about it, but Python 3000 changes everything with function annotations. Imagine you now have something like:
def __import_ex__(name:Sequence[str], caller__name__:str,
                  caller__path__:(Sequence[str]|None)=None) -> object
In Py3K, there is a lot more being lost, and convoluted workarounds will not be acceptable. Once people really start using annotations to get real work done, the need to ‘clone’ a definition will become apparent and someone will come up with a proper solution. Who knows, it might be me.
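For readers on a current Python 3, here is a minimal sketch of what a ‘cloned’ definition can look like. It assumes functools.wraps() together with inspect.signature() (the signature objects from PEP 362); the wrapper name and the annotated myfunc are illustrative only, not the 900-line module mentioned above.

```python
import functools
import inspect
import threading


def threadsafe(func):
    """Serialize calls to func while keeping its metadata visible."""
    lock = threading.Lock()

    # wraps() copies __name__, __doc__ and friends, and records the
    # original function as wrapper.__wrapped__.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    return wrapper


@threadsafe
def myfunc(fileobj, nslices: int, throw_on_error: bool = True) -> None:
    """myfunc does something"""


# inspect.signature() follows __wrapped__, so introspection tools see the
# original parameter names, defaults and annotations, not (*args, **kwargs).
sig = inspect.signature(myfunc)
param_names = list(sig.parameters)
```

The wrapper still accepts anything at call time; only the introspected signature is preserved, which is what help() and documentation tools rely on.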
2. sys.exit, Py_Exit, and SystemExit
This is the one which pissed me off for the last time and got me finally writing a patch for what I considered to be a bug. On further reflection it is a PEP. sys.exit and Py_Exit should either be something which terminates the process, or a special Python exception which does not result in an error stack if unhandled. Currently it is both and neither. “Py_Exit(3)”, “sys.exit(3)” and “raise SystemExit, 3” are all equivalent. In truth all that sys.exit() does is set the Python exception. If there is an unhandled exception and it is the SystemExit exception, all the PyRun_* code will omit the exception printing and call the C library's exit(). This allows atexit to do its work, garbage collection and some of the extension module stuff to clean up properly, and Py_Finalize to occur. What this also means is that code can catch the SystemExit exception and ignore it. When this happens, nothing happens. You do not have a guaranteed means of terminating the Python process with an exit code from Python, or even from the Python C/API. Calling Py_Exit() could result in the exception being caught and the interpreter not being shut down. Worse yet, if the exception goes unhandled, control is not passed back to Py_Main and the ‘-i’ option is ignored!
print "sorry... you have no clue where or why the script ended..."
print "No interactive mode for you!"
print "importing this as a module will terminate even IDLE!"
raise SystemExit, 5
Mya@miyu ~ $ python -i exit.py
sorry... you have no clue where or why the script ended...
No interactive mode for you!
importing this as a module will terminate even IDLE!
Mya@miyu ~ $
This really messes up some IDEs and debuggers, and completely screws over people embedding Python in another application. None of the PyRun_* calls return if they catch SystemExit; they call exit() instead. Thanks a lot! Either provide a way to call exit() and terminate the process, or have a means of stopping the interpreter cleanly. What we have currently is sometimes one, sometimes the other, many times neither. The PyRun_* calls need to return the exit code and Py_Main needs to deal with it properly. I do not believe there is any reason to call exit() ever.
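The "exception, not termination" half of this is easy to see from pure Python. The sketch below, written for a current Python 3, runs a script in-process and returns the exit code to the caller instead of letting the process die, which is roughly the behavior being asked of the PyRun_* calls. The helper names exit_code_from and run_returning_exit_code are hypothetical, and this is not how the C API behaves.

```python
import sys


def exit_code_from(exc):
    """Translate a SystemExit into the status the interpreter would report."""
    code = exc.code
    if code is None:
        return 0          # sys.exit() with no argument means success
    if isinstance(code, int):
        return code       # sys.exit(3) -> 3
    return 1              # any other object is treated as a failure


def run_returning_exit_code(source, filename="<embedded>"):
    """Run source as a script and hand the exit code back to the caller."""
    namespace = {"__name__": "__main__"}
    try:
        exec(compile(source, filename, "exec"), namespace)
    except SystemExit as exc:
        # sys.exit() only raised an exception; catching it keeps us alive.
        return exit_code_from(exc)
    return 0
```

Calling run_returning_exit_code("import sys; sys.exit(3)") hands 3 back to the host, which can then decide for itself whether to shut down.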
There is never any time to write patches or get into discussions on these things. Thankfully some of them have caused us enough problems that they are being put on the schedule at work, so they are no longer things I need to spend my ‘spare personal time’ on. Once I get the preliminary Py_Exit PEP done, I will run it first past the python-users list and then past python-dev to get feedback and support. If it’s not in before 3.1 that is fine, but I will be in shock if people believe the current behavior is the proper behavior.
3. No Py_Initialize hook
For Py_Finalize we have Py_AtExit and the atexit module, but for people embedding Python in another application, we have no Py_AtInit. This might seem silly, but we need a better bootstrap system. To get around this we embed the site module. That’s right, we have the site module as a builtin, and part of what it does is go out to disk and import the real one after our system initialization is done. This is actually a nice way of dealing with things, especially for debugging our code, as the ‘-S’ option will stop the site import. It is still a hack, and for all its nice side effects, it is still an ugly, ugly hack. There should be an approved and supported way of doing this.
4. .pyc .pyo compile location inflexibility
The compiled Python files must be in the same directory as the .py files. You cannot compile them to another directory and have them still refer back to the .py file for proper exception stacks and debugging. This is a major problem for people who like clean build systems. We have taken to doing a custom compile, then moving the file into the proper build directory and running from the .pyc/.pyo files. This requires custom tools, leaves cruft in your source directory if you abort a build at the wrong time, and in general is an ugly hack. For certain network resources it would be nice to have one version of the source files and different .pyc/.pyo files in different directories depending on the version of Python. Some custom import hooks can achieve this to a point, but in order for it to work, once again, you need to first compile the file in the .py directory and then move it, so that the .pyc/.pyo has the proper .py file listed in its header. This means you can’t have your source release directory read-only. And no, precompiling is not always an option. Jython had this ability back in 2000, but we don’t have it in any other Python implementation that I know of.
A simple change to compile() should do the trick. Between that and the work Brett Cannon is doing on the import system, it should be trivial to implement any inane import/compile customization one could ever want to hang themselves with.
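As a rough illustration of the goal, the sketch below uses the standard library's py_compile.compile(), whose cfile argument sets the output location and whose dfile argument sets the source name recorded in the bytecode. It is written for a current Python 3 and loads the result with importlib's SourcelessFileLoader; the helper names compile_out_of_tree, load_pyc and demo are made up for illustration, and the per-version directory lookup described above is not covered.

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile


def compile_out_of_tree(src_path, build_dir):
    """Compile src_path into build_dir, recording the real .py path."""
    name = os.path.splitext(os.path.basename(src_path))[0]
    pyc_path = os.path.join(build_dir, name + ".pyc")
    # cfile: where the bytecode goes; dfile: filename stored for tracebacks.
    py_compile.compile(src_path, cfile=pyc_path,
                       dfile=os.path.abspath(src_path), doraise=True)
    return pyc_path


def load_pyc(module_name, pyc_path):
    """Import a module directly from a .pyc file, with no source lookup."""
    loader = importlib.machinery.SourcelessFileLoader(module_name, pyc_path)
    spec = importlib.util.spec_from_file_location(
        module_name, pyc_path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


def demo():
    """Compile a throwaway module out of tree and report what it recorded."""
    src_dir = tempfile.mkdtemp()
    build_dir = os.path.join(tempfile.mkdtemp(), "build")
    src_path = os.path.join(src_dir, "outoftree_demo.py")
    with open(src_path, "w") as fh:
        fh.write("def where():\n    return where.__code__.co_filename\n")
    pyc_path = compile_out_of_tree(src_path, build_dir)
    module = load_pyc("outoftree_demo", pyc_path)
    return src_dir, os.path.abspath(src_path), pyc_path, module.where()
```

Nothing is written next to the source file here, so the source directory could be read-only; tracebacks still point at the original .py because the recorded filename travels inside the code object.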
5. No .so/.dll zip import
This is not really true: there was .so importing from zips at one point, but it had issues. There was a patch for .dll imports from a ‘compressed directory’ but again there were issues, and no Mac support. Being able to package everything up into a single zip file would be very very nice, and help enforce that the proper library was loaded with the proper source. Currently if you use a zip with a private interface (_sre.so is an example of this), you need two entries on the Python path, one for the zip and one for the shared libraries. This can cause problems. I can’t count the number of times I have had to add ‘-vv’ to the command-line options of some experiment to diagnose some bizarre behavior that was due to loading the wrong .so for the parent .py.
Relative imports will solve the vast majority of the problems, but it would still be nice to have it all in one package, as sometimes people forget to copy the .so’s as well.
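For contrast, the pure-Python half of the story already works: a zip file placed on sys.path is importable through the built-in zipimport machinery. The sketch below, for a current Python 3, shows that single-entry setup; the helper name import_from_zip and the archive name bundle.zip are illustrative. Extension modules (.so/.dll) are exactly what this mechanism does not load, which is why the second path entry is needed.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Pack one pure-Python module into a zip and import it from there."""
    zip_path = os.path.join(tempfile.mkdtemp(), "bundle.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(module_name + ".py", source)

    # A single sys.path entry pointing at the archive is enough for .py files.
    sys.path.insert(0, zip_path)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(zip_path)
```

The imported module's __file__ points inside the archive, which makes it easy to confirm where a given module was actually loaded from.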
Conclusion
Everything here has to do with embedding and extending Python for custom environments. I would be interested in hearing from other people who do this. This is a small part of my day job and oddly enough one of the least interesting. These are the things that are painful; everything else Python related (except SWIG) just works and thus goes unnoticed. The really interesting stuff I can’t talk about. It sucks working on mind-blowing technology and not being able to talk about it, but it’s better than the alternative: talking about working on uninteresting, boring junk. I guess I just have to give up on sleep.
1. This is not self-deprecation. There is no doubt of the greater impact of their personal work on the greater open source community.

Hey there. I agree with the exit bit. Didn’t quite get the thread part. The “mind blowing” technology looks pretty awesome; they should spend a tad more money on advertising though, the one with the joggers was a little lame.
You know about the decorator module, right? It handles exactly that problem.
Dunno if you mentioned it or not, I got lost in all the ranting.
What about functools.wraps() for the “decorators are lossy” problem?
Translated:
The threading was just an example of a common decorator which does not change the argument spec of the function it decorates, but by decorating it, you lose that argument spec, which is used by help, doc generation, and other decorators like those in the functools module.
Peter:
Yes I know about the decorator module, and I have my own version of it. The problem is those calls do not do what this one does:
http://derivin.addr.com/pycon2006/dec1.py
Which is to preserve the argument spec of the original function or method. If the original has ‘def foo(a, b, c=None)’ in module ‘testing’ then I want all that information on the wrapped version preserved (and no, Brett, I neither want nor support tuple arguments; they suck).
UPDATE: I AM WRONG. It does exactly what I want!
I want the function name, module and arguments to be preserved while still being wrapped.
Simon:
I know about this module, and the first function proposed for it was the decorator decorator, which did what I wanted (but in a way that sometimes crashed, didn’t support default args, and had major problems with tuple arguments).
>>> import functools
>>> def dec(f):
...     @functools.wraps(f)
...     def wrap(*args, **kwds):
...         print "Hello"
...         return f(*args, **kwds)
...     return wrap
...
>>> @dec
... def foo(a, b, c=None):
...     """doc string"""
...     print a,b,c
...
>>> help(foo)
Help on function foo in module __main__:

foo(*args, **kwds)
    doc string

>>>
Not quite good enough. I want to see my argument names and default args. With the advent of function annotations, this problem will be compounded. Annotations will be used for interesting things like adding strong(ish) typing, argument validation, more interesting functools, WSDL generation (*shiver*), and rich documentation. If adding a pass-through decorator removes the annotations, things will break. That will be unacceptable to many people.
Peter:
(NOTE: Peter’s solution looks like it was released before I started working on my own.)
Yup! That’s exactly what I want, and you have no idea how happy I am that someone else also came up with a solution! I don’t know how I missed your module before! I am going to go with the excuse that I came up with a solution for myself back in 2005 and haven’t bothered looking that hard since then.
For Py3K we really need this functionality built in. Interested in working on a PEP together?
PEP 362 (function signature objects) was partially motivated by decorators that hid details of the wrapped function. The hope was that introspection tools would use the signature object instead of what was on the function and thus ignore the typical decorator’s “*args, **kwargs” signature.
As for #4, I think you are right that my import stuff will help. You can probably get away with having two entries on sys.path (one for the .py files, the other for the .pyc/.pyo files). You then have a custom handler for the loader that deals with the .py files and knows to write out all .pyc/.pyo files to the other directory. A custom loader for the .pyc/.pyo files would also be needed to set __file__ properly for the exceptions (I think that is all .pyc files need).
Then again, .pyc/.pyo files need a redesign. They really should be merged into a single format and probably contain metadata that specifies what file it was created from so that this is all much simpler.
Brett:
I really should have done better research on #1, as it seems I missed many things. I knew about PEP 362, but was not up to speed on the work done on it. I thought it was going to be rejected (from early dev chatter back in October). I haven’t kept up much with the dev work since then. It’s good to hear that there is a sandbox implementation which works with annotations. I will try to get back up to speed on this.
On the compiling of .pyc/.pyo files, if you write a custom compiler you can enforce that the __file__ is set to an absolute path. This is what we are currently doing, but without your import extensions and bootstrapping, you still need to compile to the same directory as the .py. I don’t think you really need two directories on the path; you can have one directory for the .py files, and then as part of the custom import, before doing the compile step, first extend the search to that file’s directory plus, say, a directory with the Python version number. This is the type of hack that is not possible with the current import hook.
allot
-verb
to divide or distribute by share or portion; distribute or parcel out; apportion: to allot the available farmland among the settlers.
a lot
-adverb
to a very great degree or extent; “I feel a lot better”
Grammaman (or Grammar Man?):
(Is that you Joev?)
Thanks, I will make every attempt not to make that mistake again; (I have made it a whopping 38 times already.) I make no secret of the fact that I can not spell to save my life.