|
Python can be a very good language for processing XML-formatted files.
- 2009-Nov-13: Beautiful Soup is a Python HTML/XML parser for projects like screen-scraping. It is available here too. There is also an example of doing some markup massage to clean up problematic HTML before running Beautiful Soup on it. [231] [1]
- 2009-Jul-15: bier-soup.py is an example of reading in HTML tables and processing them into text files with BeautifulSoup. [8265] [1]
- 2009-May-01: pyxser, a Python-object-to-XML serializer. [7929] [1]
- 2008-Dec-11: Ian Bicking suggests that lxml is a good alternative to BeautifulSoup for web scraping tasks. He also has some short examples including an HTML diff. [7336] [1] [2]
- 2008-Oct-28: gxml is a module providing a common interface to some of the popular XML libraries. [7109] [1]
- 2008-Oct-11: A patch for ElementTree to better support the CDATA section. [7006] [1] [2]
- 2008-Jul-09: lxml, discussed here, provides Python bindings for the libxml2 and libxslt libraries. O'Reilly discusses libxslt here. How lxml and ElementTree compare, and why they both exist. The PyPI page for it is here. [220] [1]
- 2008-Jun-21: Converting XML to Dictionary and Back. The intent is to make working with XML-structured data easier for the Python programmer. While this is not a general solution for all XML data files, it may be useful where the structure and content are more restricted, as in a configuration file. [6400] [1] [2]
- 2008-Mar-31: Looking at the performance of various HTML parsers for Python (lxml, BeautifulSoup, html5lib, ElementTree, cElementTree, HTMLParser, htmlfill, Genshi, xml.dom.minidom). [5363] [1]
- 2008-Mar-30: xmlpolymerase is a Python object serializer that will pack to and unpack from XML. Sort of an XML version of Pickle. [5355] [1]
- 2008-Feb-25: openxmllib is a module for working with OpenXML documents. [5158] [1]
- 2008-Feb-05: PyXML, XML Parsers and API for Python. The project home page is here. [5050] [1]
- 2008-Jan-10: Why binary-XML is solving the wrong problem. [4641] [1]
- 2007-Oct-22: XML to Python Data Structure. This allows an XML object to be easily accessed as a Python object (with some minor limitations). [3500] [1] [2]
- 2007-Aug-31: Gnosis Utilities, a collection of utilities for working with XML documents. It includes some other things like full-text indexing and searching, Python object introspection, hashcash, and spam filtering. [227] [1]
- 2007-Aug-31: PTML, for embedding Python into text documents. [229] [1]
- 2007-Aug-24: simplexml, another XML file manipulation library for Python. [216] [1]
- 2007-Aug-24: xmlmodel allows you to expressively define an XML document using native Python classes; you can then access the elements of the XML through a tree of native Python objects. [217] [1]
- 2007-Aug-24: mlk_xhtml, a package for creating XHTML. [218] [1]
- 2007-Aug-24: ElementTree is a Python XML reader/parser/writer that has been implemented in both pure Python and C. Using non-standard encodings in cElementTree. Here is a talk on using ElementTree to process XML. A recommendation for ElementTree. [219] [1]
- 2007-Aug-24: xmltramp makes reading XML data easy. [221] [1]
- 2007-Aug-24: PySimpleXML (and here) simplifies the translation between Python structures and XML. [222] [1]
- 2007-Aug-24: pyxmlserial, XML serialization of basic Python data. [223] [1]
- 2007-Aug-24: YAXL is Yet Another (Pythonic) XML Library, one of the design goals being that it can be understood in 15 minutes. [224] [1]
- 2007-Aug-24: A look at Python-based DOM manipulation templating systems. [225] [1]
- 2007-Aug-24: pyfo, a module for quickly generating XML representations of Python objects. [226] [1]
- 2007-Aug-24: surely, a program to convert files written in a shorthand notation (similar to Python syntax) into XML. [228] [1]
- 2007-Aug-24: pullparser, a simple module for HTML parsing, supposed to be easier to use than the HTMLParser module for some things. [230] [1]
- XIST, a Python framework for reading and writing XML. [215] [1]
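
To make the "XML to dictionary and back" idea from the 2008-Jun-21 entry concrete, here is a minimal sketch using the standard library's ElementTree. It is not the linked recipe. The helper names `element_to_dict` and `dict_to_element`, the `@attrs` and `#text` keys, and the sample config are all assumptions chosen for illustration. As that entry warns, this only suits restricted, configuration-style documents: mixed content and the relative order of differently named siblings are not preserved.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an Element into a plain dict (hypothetical helper)."""
    node = {}
    if elem.attrib:
        node["@attrs"] = dict(elem.attrib)
    children = list(elem)
    if children:
        for child in children:
            # Children with the same tag are collected into one list.
            node.setdefault(child.tag, []).append(element_to_dict(child))
    else:
        # Leaf element: keep its text, stripped of surrounding whitespace.
        node["#text"] = (elem.text or "").strip()
    return node


def dict_to_element(tag, node):
    """Rebuild an Element from a dict produced by element_to_dict."""
    elem = ET.Element(tag, node.get("@attrs", {}))
    if "#text" in node:
        elem.text = node["#text"]
    for key, value in node.items():
        if key in ("@attrs", "#text"):
            continue
        for child in value:
            elem.append(dict_to_element(key, child))
    return elem


# Made-up configuration document, used only for this example.
CONFIG_XML = """
<config>
  <server name="alpha"><port>8080</port></server>
  <server name="beta"><port>8081</port></server>
</config>
"""

root = ET.fromstring(CONFIG_XML.strip())
as_dict = element_to_dict(root)
rebuilt = dict_to_element(root.tag, as_dict)
xml_text = ET.tostring(rebuilt, encoding="unicode")
```

With this layout, `as_dict["server"][0]["@attrs"]["name"]` gives the first server's name, and converting `rebuilt` back to a dict yields the same structure as `as_dict`.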
|