Tools for Stock Market Analysis in Python

Python Programming Techniques

Copyright 2010 by Stephen Vermeulen
Last updated: 2010 Jul 17
Testing Python Programs





Python is a cross-platform application development and scripting language. This page contains some simple (and not so obvious) coding techniques and notes.

  • 2010-Jul-17: In Python one can easily add the contents of two lists together with a statement like:
    a = b + c
    
    where b and c are lists. But if you need to join a large number of lists into a single list, using the extend() method of the list object is much faster, because it appends to the existing list in place rather than building a new list each time. In one rather extreme case, where I needed to build an overall list of about 1 million points from individual lists that were each about 400 points long, switching to extend() sped this up by a factor of about one thousand. So the faster code would be something like:
    a = []
    a.extend(b)
    a.extend(c)
    
    [9312]
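A minimal sketch of the two approaches in Python 3, assuming the small lists arrive as a list of lists. The names `chunks`, `join_with_plus` and `join_with_extend` are hypothetical and the data is made up for illustration.

```python
def join_with_plus(chunks):
    """Concatenate by repeated '+': each step copies the whole result so far."""
    result = []
    for chunk in chunks:
        result = result + chunk
    return result


def join_with_extend(chunks):
    """Concatenate in place: each step only copies the new chunk."""
    result = []
    for chunk in chunks:
        result.extend(chunk)
    return result


# Hypothetical data: 50 lists of 400 points each.
chunks = [list(range(i * 400, (i + 1) * 400)) for i in range(50)]
joined = join_with_extend(chunks)
```

Both functions return the same list; only the amount of copying differs. The actual speedup depends on the number and size of the lists, so it is worth measuring with the timeit module on your own data.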
  • 2010-Jun-16: Exceptions are your Friends is a quick reminder of how to use exceptions. [9205]
  • 2010-Feb-05: Using super in Python 3 (and perhaps earlier versions too). [8951]
  • 2010-Jan-27: Code reloading and event notifications talks a bit about how to dynamically reload code. [8939]
  • 2010-Jan-26: Flat is better than nested puts forward the point that flat code is easier to understand than nested code, which I certainly find to be true. [8936]
  • 2010-Jan-08: An explanation of why you cannot pickle generators. [8900]
  • 2009-Nov-27: The "__slots__" mechanism can be used to reduce memory usage in Python programs. [8808]
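A minimal sketch of what `__slots__` changes, written for Python 3. The class names are hypothetical. Declaring `__slots__` means instances are created without a per-instance `__dict__`, which is where the memory saving comes from.

```python
class PointWithDict:
    """Ordinary class: each instance carries a per-instance __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointWithSlots:
    """Declares its attributes up front, so instances have no __dict__."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PointWithSlots(1.0, 2.0)
has_dict = hasattr(p, "__dict__")  # no per-instance dictionary is created

try:
    p.z = 3.0  # 'z' is not listed in __slots__
    can_add_attribute = True
except AttributeError:
    can_add_attribute = False
```

The trade-off is that attributes not named in `__slots__` cannot be added to an instance later.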
  • 2009-Nov-20: The py.path svn wrapper module uses a sneaky "__" module name to keep some internals private. [8776]
  • 2009-Nov-10: Some changes have been made to how Python lambdas access the surrounding state variables. [8724]
  • 2009-Nov-05: Python Quirks in cmd, urllib2 and decorators [8706]
  • 2009-Nov-03: Installing Python on demand on Windows. [8691]
  • 2009-Oct-23: Import magic: splitting a Python package into several sub-packages in separate directories. [8667]
  • 2009-Oct-23: Python packages: how to create and use them. [8665]
  • 2009-Oct-23: Avoiding side effects when importing modules: don't include top level code in a module if you want to be able to easily and safely reuse it. [8645]
  • 2009-Oct-09: The argument against flattening lists: if you need to do this you've built your list wrong. [8612]
  • 2009-Sep-29: Python: Aggregating function arguments talks about the various ways to call functions and how to receive and process those arguments. [8584]
  • 2009-Sep-22: pyreport enables a form of literate programming with Python. [8554]
  • 2009-Sep-09: A detailed look into why the Python pickle module is insecure. The author also suggests that unpickling can be made safer by controlling the classes that can be created, by overriding the find_class method. [8496]
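A minimal sketch of the find_class idea in Python 3, following the pattern shown in the standard library documentation rather than the linked article. The allow-list `SAFE_BUILTINS` and the helper `restricted_loads` are hypothetical names chosen for this example.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: only these builtin names may be looked up.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

This narrows what a pickle can construct, but it should be treated as a mitigation. Data from untrusted sources is still better handled with a format that cannot name arbitrary objects.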
  • 2009-Aug-26: Using the new With statement to implement multi-line Lambdas in Python. Scary, very scary. [8426]
  • 2009-Aug-26: From the "why, oh why?" department: identifying prime numbers using regular expressions, explored with Python. [8422]
  • 2009-Aug-14: A look at some of the new features of Python 2.6. [8402]
  • 2009-Aug-06: Using a dictionary of functions as a switch/case statement rather than a long if/elif chain. [8371]
  • 2009-Jul-28: The singleton design pattern shows up in Python under the guise of modules, as modules are generally only imported once. This article talks about some of the difficulties with modules, especially when you need to reload them. [8343]
  • 2009-Jul-27: Printing the current stack in a running Python program is quite simple. [8335]
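A minimal sketch using the standard traceback module, which may differ from the approach in the linked article. `traceback.print_stack()` writes the current call stack to stderr. `traceback.format_stack()` returns the same information as a list of strings, which is used here so the result can be inspected. The function names `inner` and `outer` are hypothetical.

```python
import traceback


def inner():
    # format_stack() returns the text that print_stack() would write to stderr.
    return traceback.format_stack()


def outer():
    return inner()


stack_lines = outer()
stack_text = "".join(stack_lines)
```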
  • 2009-Jul-19: The strange case of what a return statement within a finally block does along with some oddities of exceptions being swallowed by finally blocks. [8299]
  • 2009-Jul-17: Some thoughts on enabling debug logging only when it is needed, and some after thoughts on how to do this better with contexts (i.e. the "with" statement). [8286]
  • 2009-Jun-19: Python Exception Handling Techniques by Doug Hellmann is a good discussion of Python exception coding. [8170]
  • 2009-Jun-18: Remember the in operator can be used in a number of ways. [8165]
  • 2009-Jun-16: Python Threading is Fundamentally Broken: the GIL is examined in detail and is found to cause significant performance problems on multi-core CPUs when threading is used. More followup on this comparing CPython to Stackless. And another round of followup on this. This article (referenced in one of the comments) shows how raising the interpreter's check interval (set with sys.setcheckinterval()) to a value much larger than the default of 100 can greatly improve the situation. However, it still does not allow a multiprocessor machine to solve a problem in half the elapsed time by using two threads at once instead of a single thread. [8137]
  • 2009-Jun-12: Safely using destructors in Python, discusses the use of the __del__ method and weakref module. [8135]
  • 2009-May-25: Pipe Fitting with Python Generators talks about using a pipe class along with generators and filters to examine a problem space. [8032]
  • 2009-May-22: An overview of using factory functions. [8022]
  • 2009-May-22: Fetching docstrings from Python objects. [8021]
  • 2009-May-19: A discussion of some warts that exist in the conversion of floats to string form. The real issue here is that the str() and __repr__() functions of a float do not return strings formatted to the same degree of precision. str() which is used by "print" returns less precision than __repr__(), as can be seen here:
    x=123456.7890123456
    x
    123456.7890123456
    str(x)
    '123456.789012'
    x.__repr__()
    '123456.7890123456'
    x.__str__()
    '123456.789012'
    print x
    123456.789012
    
    [7997]
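The session above is from Python 2. As far as I know, from Python 3.2 onward str() and repr() of a float return the same shortest round-tripping string, so the discrepancy no longer appears. The older 12-significant-digit form can still be produced explicitly with a format specification, as this small Python 3 sketch shows.

```python
x = 123456.7890123456

as_str = str(x)                 # Python 3: same text as repr(x)
as_repr = repr(x)
twelve_digits = format(x, ".12g")  # the rounding Python 2's str() applied
round_trips = float(as_repr) == x  # repr() output converts back exactly
```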
  • 2009-May-15: Is Python's limited access control to class members a good thing or a bad thing? [7981]
  • 2009-May-01: Are closures a good or useful thing to have in Python? [7927]
  • 2009-Apr-28: In Obscure Magic Methods and Pickle Support the author takes a look at some of the magic methods which are used behind the scenes on your behalf by the Python interpreter when doing things like iteration and pickling. IronPython in Action has an appendix of all the Python Magic Methods. [7137]
  • 2009-Apr-23: A tutorial on coroutines in Python was presented at PyCon 2009 and found to be quite good. Another programmer recommends this tutorial. [7809]
  • 2009-Apr-14: Writing a package in Python is worth a read. [7868]
  • 2009-Apr-08: yield statements can have some unexpected side-effects. [7849]
  • 2009-Apr-08: In Mixins considered harmful Michele Simionato attempts to make a case against using mixins; however, I think what he's really arguing against are large class hierarchies. Mixins are really just multiple inheritance, and ideally they just introduce new methods in an orthogonal fashion. In the Java world they split the idea of inheritance in a rigid (but useful) fashion by introducing the concept of an interface, which lets you add a useful set of methods to a class to give it certain behavior (but without introducing new member variables). To me a well designed class hierarchy presents classes that are useful at every level, so if you learn how to use the core functionality you can reuse this knowledge when using any derived class, and you only need to concern yourself with learning what's new and useful about each new derived class. I prefer to see classes that are intended to be used as mixins to be small, self-contained, tools that are intended to provide just a specific function or service. The second, third, and fourth articles in the series. [7400]
  • 2009-Apr-08: Python's module lookup system may have a large performance impact, especially if you have a lot of eggs in the system. [7845]
  • 2009-Apr-07: Pyload is a project to understand a module import system. [7839]
  • 2009-Apr-03: Could the Python import mechanism be broken beyond repair? I can't say I have really had to struggle with it to the extent this article describes, so I'm probably in the large group who find it does the right thing most of the time and in the rare cases it causes problems some thought (and perhaps recoding) is enough to avoid the issue. Still, I would not argue against a better design where even the edge cases work easily. [7823]
  • 2009-Mar-25: Some notes on using Python's list comprehensions, filter and map. [7790]
  • 2009-Mar-22: The use and abuse of keyword arguments in Python asks a few questions about what it makes sense to use kwargs for and warns of the legibility cost of doing so. [7761]
  • 2009-Feb-25: Decorators can be used to provide some aspect oriented programming features to Python. This example shows how to do this with decorators and context managers. [7646]
  • 2009-Feb-25: paycheck uses decorators to add tests to functions. [7645]
  • 2009-Feb-18: Explaining Why Interfaces Are Great attempts to explain why interfaces would be better than the new Python Abstract Base Classes (ABCs). I rather liked the way Java used interfaces but these articles are talking about a lot more complexity. [7608]
  • 2009-Feb-03: Python Threads and the Global Interpreter Lock. [7529]
  • 2009-Feb-03: An article on using the new "with" statement and the contextmanager decorator. [7528]
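A minimal sketch of the "with" statement combined with the contextmanager decorator from contextlib. The names `tracked` and `events` are hypothetical. Code before the `yield` runs on entry to the block, and code in the `finally` clause runs on exit, even if the block raises.

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    events.append("enter " + name)      # runs when the with block is entered
    try:
        yield name                      # value bound by 'as'
    finally:
        events.append("exit " + name)   # runs when the block exits, even on error


with tracked("job") as label:
    events.append("working on " + label)
```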
  • 2008-Dec-25: A few articles on plugins: [7368]
  • 2008-Dec-08: Python 2.6 adds the ability to execute zipped Python files which can make distributing applications easier. [7316]
  • 2008-Dec-08: Why explicit type checking is frowned upon in the Python world. [7310]
  • 2008-Dec-04: The Magic Sentinel is a nice explanation of how to provide a unique default value, rather than just using None. [7296]
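A minimal sketch of the sentinel idea, which may differ in detail from the linked article. The names `_MISSING` and `lookup` are hypothetical. A fresh `object()` is unique, so the function can tell "no default supplied" apart from a caller who deliberately passes None.

```python
_MISSING = object()  # unique sentinel; no caller can pass this by accident


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; fall back to default only if one was given."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:
        raise KeyError(key)
    return default  # may legitimately be None
```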
  • 2008-Nov-26: Some examples of how to make your Python code look a lot like Lisp. Some further examples on how to change the syntax of logical operators as well. [7242]
  • 2008-Nov-10: Debugging Python regular expressions talks about using Kodos to do this. The comments also mention that the standard Python install includes
    /Tools/scripts/redemo.py
    which provides a Tkinter GUI for doing this sort of thing. [7183]
  • 2008-Nov-05: Some odd tricks with metaclasses and descriptors in Python. [7153]
  • 2008-Nov-05: The idea of soft-exceptions, or more properly adjustable exceptions. This is to make exception handling configurable at run time, allowing a program to log but proceed through some exceptions when it is in "soft" mode, but to stop when in "regular" exception mode. This can make debugging certain conditions easier. [7152]
  • 2008-Nov-04: Named Tuples, this is a class that allows you to access the contents of tuples as if they were objects with named attributes. A similar recipe provides records which are mutable. [654] [1]
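The recipe's idea is also available in the standard library as collections.namedtuple (since Python 2.6). A minimal sketch, with a hypothetical `Quote` record and made-up values:

```python
from collections import namedtuple

# Hypothetical record type for illustration.
Quote = namedtuple("Quote", ["symbol", "price", "volume"])

q = Quote("XYZ", 12.5, 1000)
symbol, price, volume = q       # still unpacks like an ordinary tuple
q2 = q._replace(price=13.0)     # tuples are immutable, so this builds a new one
```

Fields can be read by name (`q.price`) or by position (`q[1]`).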
  • 2008-Oct-28: Metaclasses in Five Minutes adds some evil to your Python coding skills. [7107]
  • 2008-Oct-15: The Method Resolution Order (MRO) for Python 2.3 talks about the details of how the interpreter picks a particular method when multiple inheritance is present. [7026]
  • 2008-Oct-07: Why there is no need for an unzip() function in Python. [6993]
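A minimal sketch of the usual reason: zip() combined with argument unpacking is its own inverse. The variable names are hypothetical.

```python
pairs = [(1, "a"), (2, "b"), (3, "c")]

# Unpacking the list passes each pair as a separate argument to zip(),
# which then regroups the first elements together and the second elements together.
numbers, letters = zip(*pairs)

rezipped = list(zip(numbers, letters))
```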
  • 2008-Sep-30: The Direct Attribute Configuration Pattern discusses some of the ways that program configuration information can be organized. [6949]
  • 2008-Sep-29: Thread synchronization and thread-safe operations, has links to some good articles (here and here) that discuss the fine points of these topics. [6947]
  • 2008-Sep-28: strongbox adds smart data objects with run-time type checking to Python. [6942]
  • 2008-Sep-24: Using the itemgetter and attrgetter functions from the operator module. [6908]
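A minimal sketch of both functions used as sort keys. The `Holding` class and all of the data are hypothetical.

```python
from operator import attrgetter, itemgetter


class Holding:
    """Hypothetical record type used only for this illustration."""
    def __init__(self, symbol, shares):
        self.symbol = symbol
        self.shares = shares


rows = [("IBM", 120.0), ("AAPL", 250.0), ("MSFT", 25.0)]
by_price = sorted(rows, key=itemgetter(1))       # sort tuples on element 1
symbols = [itemgetter(0)(row) for row in rows]   # pick element 0 from each tuple

holdings = [Holding("IBM", 10), Holding("AAPL", 5), Holding("MSFT", 40)]
by_shares = sorted(holdings, key=attrgetter("shares"))  # sort objects on .shares
```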
  • 2008-Sep-13: Different ways of working with multi-line long strings in Python. [6845]
  • 2008-Sep-08: Why your main program should be importable talks about a bit of Python style that is not obvious at first sight. [6816]
  • 2008-Aug-26: Some discussion on pylint, pyflakes and using the compile module to attempt to check object types in Python. [6744]
  • 2008-Jul-25: Many ways to count items in a list. [6582]
  • 2008-Jul-16: Handy Python one-liners for sysadmin-type tasks. [6532]
  • 2008-Jun-23: Extending the pdb to give it syntax colour highlighting and tab completion. [6411]
  • 2008-Jun-18: Wrapping import calls in a supervisory function can be useful if you have certain dependencies and versions to check for. [6372]
  • 2008-Jun-18: A compile decorator (code here) to allow for easy use of PyPy's translator module to compile Python code for faster execution. [6371]
  • 2008-Jun-15: Some comments on using Pylint. [6346]
  • 2008-Jun-02: Common design patterns in Python, references this article which gives examples of iterators, decorators, factories, states and templates. There is also a link to a video of a Google talk on the subject. [6269]
  • 2008-May-09: How default arguments to functions get their values is rather interesting. Take a look at the spam() example to see how strange this can get. [6083]
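A minimal sketch of the behavior, using hypothetical functions rather than the spam() example from the linked article. Default values are evaluated once, when the `def` statement runs, so a mutable default is shared between calls.

```python
def append_shared(item, bucket=[]):
    # The same list object is reused on every call that omits 'bucket'.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Common workaround: create the list inside the function body.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)   # same list as 'first', now holding both items
```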
  • 2008-Apr-25: Generator Tricks for Systems Programmers (recommended here) is a PyCon'08 tutorial presentation that covers using generator functions and expressions with examples applicable to systems programming. [5960] [1]
  • 2008-Apr-19: A discussion of whether threads or subprocesses are better. [5934]
  • 2008-Apr-03: How to list classes, methods and functions in a module using the inspect module. [5384] [1]
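A minimal sketch using inspect.getmembers() with the predicate functions the inspect module provides. The helper `list_members` is hypothetical, and the json module is only a convenient example to inspect.

```python
import inspect
import json  # any imported module will do; json is just an example


def list_members(module):
    """Return (class names, function names) found in the module namespace."""
    classes = [name for name, obj in inspect.getmembers(module, inspect.isclass)]
    functions = [name for name, obj in inspect.getmembers(module, inspect.isfunction)]
    return classes, functions


classes, functions = list_members(json)
```

The same call applied to a class, for example `inspect.getmembers(json.JSONEncoder, inspect.isfunction)` in Python 3, lists the functions defined on that class.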
  • 2008-Mar-19: This set of links to resources from the PyCon 2008 talks has a fair bit of information about distributed computing topics. [5309]
  • 2008-Mar-19: Introspecting call arguments, a recipe that implements in Python the algorithm the interpreter uses for passing and assembling the arguments to functions. [5307] [1]
  • 2008-Mar-17: This lightning talk: Python Concurrency from PyCon 2008 by Jesse Noller gives an overview of the four major packages that are currently available for multi-processing from within Python. [5298]
  • 2008-Mar-17: A talk from PyCon 2008 by Alex Martelli called Callback Patterns and Idioms in Python is a good read on the sort of things that can be done with callbacks. [5296]
  • 2008-Mar-11: Some notes on using the eval() function. [5252]
  • 2008-Mar-03: This recipe allows you to create a restricted Python function from a string, the intent being to allow an application to be safely scripted by user-written functions in a controlled fashion. A cautionary follow-up mentions that the rexec module is known to have security issues and is being removed from Python. [5206] [1]
  • 2008-Feb-22: This recipe uses PEP 263 to extend the syntax of the Python language. [5141] [1]
  • 2008-Feb-20: Some background on hash functions and the behavior of Python's built in hash function. [5128]
  • 2008-Feb-19: Some detailed notes about how finalizers are called and work in Python. [5122]
  • 2008-Feb-12: Enumerating Trees to examine postfix expressions. [5082]
  • 2008-Feb-07: Dr. Dobb's looks at Concurrency and Python. [5059]
  • 2008-Jan-30: MonkeyPatching a Python program can be done by adding extra code to the __init__.py file. More discussion of this can be found here. [5021]
  • 2008-Jan-24: A file contents caching system that appears as a dictionary. Setting a new value in the dictionary causes it to be written to disk storage, and reading a value loads it from the dictionary unless the key is not present, in which case it is fetched from disk. [4987] [1]
  • 2008-Jan-11: An article about a simple way to implement a plug-in (plugin) system to allow user-written extensions to software. [4651] [1]
  • 2007-Dec-19: A short recipe that illustrates how to create an n-dimensional dictionary where you can set a new key-value pair at an arbitrary level of nesting with simple code by using the __getitem__() and setdefault() functions. Note: this is best used by assuming your object tree is always N-dimensional and that N is constant for all values, otherwise you may encounter an error. Consider:
    >>> class auto_dict(dict):
    ...     def __getitem__(self, key):
    ...         return self.setdefault(key, self.__class__())
    ...
    >>> d = auto_dict()
    >>> d["foo"]=1
    >>> d
    {'foo': 1}
    >>> d["bah"]["bug"] = 2
    >>> d
    {'foo': 1, 'bah': {'bug': 2}}
    >>> d["bah"]["bug"]["big"] = 3
    Traceback (most recent call last):
      File "", line 1, in ?
    TypeError: object does not support item assignment
    >>> d["xbah"]["xbug"]["xbig"] = 3
    >>> d
    {'xbah': {'xbug': {'xbig': 3}}, 'foo': 1, 'bah': {'bug': 2}}
    >>> d["xbah"]["bug"]["big"] = 3
    >>> d
    {'xbah': {'xbug': {'xbig': 3}, 'bug': {'big': 3}}, 'foo': 1, 'bah': {'bug': 2}}
    >>>
    
    [4472] [1]
  • 2007-Dec-17: If you put a return statement inside a finally block the exception that triggers the finally block will be swallowed. [4437]
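A minimal sketch of the behavior, with hypothetical function names. The exception raised in the `try` block never reaches the caller of `swallow()`, because the `return` in the `finally` block replaces it.

```python
def swallow():
    try:
        raise ValueError("lost")
    finally:
        return "finally wins"   # discards the in-flight ValueError


def propagate():
    try:
        raise ValueError("kept")
    finally:
        pass                    # cleanup without a return: exception continues


result = swallow()

try:
    propagate()
    propagated = False
except ValueError:
    propagated = True
```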
  • 2007-Nov-02: Using the ANTLR parser generator to create Python code for parsing is discussed here and here. [3919]
  • 2007-Aug-24: Languages max(expressivity), a discussion of the list map function. This follows up on this article that compares Java to Scheme in this area. [575]
  • 2007-Aug-24: Examples of programming the consumer-producer paradigm in Python [574]
  • 2007-Aug-24: Using mmap to memory map a file of zip codes for faster searching [573]
  • 2007-Aug-24: In Python 2.2 a super() function (similar to what Java has) was introduced. [572]
  • 2007-Aug-24: An overview of the for-in statement. [571]
  • 2007-Aug-24: An explanation of how and why to use the new "with" statement in Python 2.5. [570]
  • 2007-Aug-24: Why one might use "foo is not None" is discussed in this article: Identity Comparision vs. Comparing Identities. Some more thoughts on when not to use "is" to compare things. [569]
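A minimal sketch of the distinction, with hypothetical names: `==` compares values, while `is` compares object identity. None is a singleton, which is why `foo is not None` is the usual test for it.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

values_equal = (a == b)    # same contents
same_object = (a is b)     # two distinct list objects
alias = (c is a)           # c is another name for the same object


def describe(value=None):
    # 'is None' distinguishes "nothing passed" from falsy values like 0 or ''.
    return "missing" if value is None else "given"
```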
  • 2007-Aug-24: Python has some language features that one might never use, but then one day you come across them in some other code and need to know what they do. The following fragments (see the reply by Xoanan on this page) are quite useful:

    List Intersection

    To determine the intersection between two lists "list1" and "list2":
    intersection = filter(lambda x:x in list1, list2)

    List Union

    To determine the union between two lists "list1" and "list2":
    union = list1 + filter(lambda x:x not in list1, list2)

    List Difference

    To determine the difference between two lists "list1" and "list2":
    difference = filter(lambda x:x not in list2, list1)

    List Distinct

    To determine the distinct elements, those not in common between two lists "list1" and "list2":
    distinct = filter(lambda x:x not in list2, list1) + filter(lambda x:x not in list1, list2)

    Unique Elements

    A discussion of the various ways of extracting the unique elements from a list

    Flattening Lists

    From time to time you might encounter a list which contains some lists, and you want to flatten this into a single list of simple elements. This voidspace article talks about two ways to do this. Perhaps the more readable method is the nested list comprehension:

    nested = [[1,2,3], [4,5], [6]]
    flatList = [x for sub in nested for x in sub]
    print flatList
    [1, 2, 3, 4, 5, 6]

    This works as follows: the first "for" loop (for sub in nested) iterates over the top list, and on each iteration picks up a sublist and places a reference to it in "sub". The second "for" loop (for x in sub) then iterates over that sublist, placing each of its elements in x. A list of all the individual values that x takes is built up by the [x ...] construct. A limitation is that every element in the outer list must support iteration (i.e. be a list, a tuple or something else list-like), so you cannot have a simple scalar element in the outer list. [568]
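One way around the scalar limitation of the flattening comprehension, as a minimal Python 3 sketch. The helper `flatten_one_level` is hypothetical, and it assumes that only lists and tuples should be expanded (strings are deliberately passed through whole).

```python
def flatten_one_level(items):
    """Flatten one level of nesting, passing non-list elements through."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(item)    # expand a sublist in place
        else:
            flat.append(item)    # keep a scalar as a single element
    return flat


mixed = [[1, 2, 3], 4, (5, 6), "seven"]
flat_list = flatten_one_level(mixed)

# The plain comprehension fails on the same kind of input.
try:
    [x for sub in [1, [2]] for x in sub]
    comprehension_failed = False
except TypeError:
    comprehension_failed = True
```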
  • 2007-Aug-24: Function dispatch tables. The general idea is that, as functions are just objects, you can store them in a list and then select a function by index and call it. Of course this works with dictionaries too, so you can also arrange to call functions dynamically by name. I had always thought that you needed to use the "apply" function to do this, but it can be done without it even in Python 1.5.2.

    >>> def foo(a,b):
    ...   return a*b
    ...
    >>> foo(2,3)
    6
    >>> def boo(a,b):
    ...   return a/b
    ...
    >>> boo(2,3)
    0
    >>> boo(2,3.0)
    0.666666666667
    >>> a=[foo, boo]
    >>> a[0](2,3)
    6
    >>> a[1](2,3)
    0
    >>> a[1](2,3.0)
    0.666666666667
    >>> d = {"mul":foo, "div":boo}
    >>> d["mul"](2., 4.)
    8.0
    >>> d["div"](2., 4.)
    0.5
    >>>


    [567]
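The session above is from Python 2, where `/` on two integers truncates. A minimal Python 3 sketch of the same dictionary dispatch idea follows, using functions from the standard operator module and making the choice between true and floor division explicit. The names `dispatch` and `calculate` are hypothetical.

```python
import operator

# Map operation names to callables; any function object can be a value here.
dispatch = {
    "mul": operator.mul,
    "div": operator.truediv,        # Python 3 '/' behavior
    "floordiv": operator.floordiv,  # integer result, like Python 2 int / int
}


def calculate(op_name, a, b):
    """Look up the function by name and call it directly."""
    try:
        func = dispatch[op_name]
    except KeyError:
        raise ValueError("unknown operation: %r" % op_name)
    return func(a, b)
```

Turning an unknown name into a ValueError keeps the lookup failure distinct from any KeyError raised inside the dispatched function itself.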


