Mathics goes open source

I’ve been working a lot on Mathics again over the last few weeks, all towards a goal I’ve had in mind for a long time: releasing it as open source. After all, that’s the one thing that can set Mathics apart from Mathematica: the only thing that’s not so good about Mathematica is that it’s not free, neither as in freedom nor as in free beer.

If you wonder what Mathics is: it’s a general-purpose computer algebra system implementing the awesome Mathematica mathematics/programming language. Some of its most important features are

  • a powerful functional programming language,
  • a system driven by pattern matching and rule application,
  • rationals, complex numbers, and arbitrary-precision arithmetic,
  • lots of list and structure manipulation routines,
  • an interactive graphical user interface right in the Web browser using MathML (apart from a command line interface),
  • creation of graphics (e.g. plots) and display in the browser using SVG,
  • an online version at www.mathics.net for instant access,
  • export of results to LaTeX (using Asymptote for graphics),
  • a very easy way of defining new functions in Python,
  • an integrated documentation and testing system.

Read the manual to learn more about the over 350 built-in functions and symbols in Mathics.

The actual heavy math stuff (like integration) is mostly done by the Python package SymPy. There is optional support for functions depending on Sage as well. Despite “out-sourcing” most mathematical functions, Mathics already has more than 20,000 lines of Python code, dealing with plenty of non-trivial stuff such as parsing Mathematica input, pattern matching, graphics generation, etc.

Unfortunately, Firefox is the only browser supported so far, because no other browser supports MathML yet. However, this is expected to change pretty soon when WebKit (used by Safari and Chrome) adds MathML support. I wonder whether Internet Explorer will ever get that far.

I set up a project homepage at www.mathics.org for organizational stuff, while www.mathics.net is still the place for the online interface of Mathics. The source code is hosted on GitHub.

I hope that one day there will be developers joining me. Contact me if you want to get involved!


TwitterExplorer

I’ve finished my work on TwitterExplorer (as far as work can ever be finished). It now features a huge, interactive clustered network of Twitter #hashtags. There are detailed statistics, timelines, and a classification (attempt) for each hashtag, plus a separate clustered network of the 3-hop network around it.

This is one more Python/Django application (for details, see the about page). The interactive graph in particular involved a lot of JavaScript tweaks with the JavaScript InfoVis Toolkit, although I’m not sure it was really worth using: in the end, I might have needed less code by starting from scratch with just jQuery and the HTML5 canvas element.

This is also the first time I placed a flattr button somewhere. Let’s see how much it generates—I don’t have any expectations.

iPhone SDK for Mac OS X 10.5 Leopard

Thanks to Drop The Nerd, I found a way of installing a version of the iPhone SDK (3.1.3) that still supports Leopard: download it from apple.com. Maybe I should’ve already switched to Snow Leopard, but actually I’m not missing much in Leopard (at least not much that Snow Leopard provides, I think), so I feel like I can still wait for “Lion”.

Now I will first create a GUI mockup for our software engineering project (“car sharing application”), and then, some day, there might be a tripedia iPhone app…

Code line count

I just asked myself how much code I have written so far in my life. I wanted to break it down into individual projects and programming languages. A shell script using wc -l could have done it, but I decided to write a short script in my “mother tongue” Python.

To set up projects and programming languages, the following variables are used:

PROJECTS = {
	'tripedia.org': '/Users/Jan/Projekte/tripedia.org/tripedia',
	'-BatchWorks': ['/Users/Jan/Projekte/BatchWorks/Src', '/Users/Jan/Projekte/BatchWorks/Components'],
	'-Mathador': '/Users/Jan/Projekte/Mathador',
}

EXCLUDE_DIRS = set(['_Kopien_', 'prototype', 'scriptaculous', 'innerdom', 'livepipe', 'gmapsutil', 'TBP', ])
EXCLUDE_FILES = set(['prototype.js', 'carousel.js', 'printf.js'])

TYPES = {
	'Pascal': 'pas',
	'C/C++': ['h', 'hpp', 'cpp', 'c'],
	'HTML': ['html', 'htm', 'xhtml', 'xhtm'],
	'CSS': 'css',
	'Python': 'py',
	'Java': 'java',
	'JavaScript': 'js',
	'PHP': 'php',
}

A minus at the beginning of a project name indicates that this is an old, discontinued project. An inverse dictionary of file types to programming languages is built using the following code:

import types

def flatten(seq):
	""" flattens lists and sequences to one dimension """

	if isinstance(seq, (list, types.GeneratorType)):
		for item in seq:
			for sub_item in flatten(item):
				yield sub_item
	else:
		yield seq

TYPES_INV = dict(flatten(((ext, type) for ext in ([exts] if isinstance(exts, basestring) else exts)) for type, exts in TYPES.iteritems()))
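
The one-liner above is rather dense. As a rough sketch of the same idea, the mapping can also be inverted with two plain loops. This version is written in Python 3 syntax (the script in this post targets Python 2), the helper name invert_types is made up for illustration, and TYPES is shortened to two entries:

```python
def invert_types(types_map):
    """Map each file extension to its language name.

    Values may be a single extension string or a list of extensions.
    invert_types is a hypothetical helper, not part of the original script.
    """
    inverse = {}
    for language, exts in types_map.items():
        if isinstance(exts, str):
            exts = [exts]  # normalize a single extension to a list
        for ext in exts:
            inverse[ext] = language
    return inverse


# Shortened demo version of the TYPES dictionary
TYPES = {
    'C/C++': ['h', 'hpp', 'cpp', 'c'],
    'Python': 'py',
}
TYPES_INV = invert_types(TYPES)
print(TYPES_INV['hpp'], TYPES_INV['py'])
```

It is longer than the generator expression, but it needs no flatten helper and is probably easier to read a year later.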

The core of the script is the following loop over projects, directories, and files:

import os

result = {}
for project, dirs in PROJECTS.iteritems():
	if isinstance(dirs, basestring):
		dirs = [dirs]
	lines = {}
	for dir in dirs:
		for dirpath, dirnames, filenames in os.walk(dir):
			for exclude in EXCLUDE_DIRS:
				try:
					dirnames.remove(exclude)
				except ValueError:
					pass
			for filename in filenames:
				if filename not in EXCLUDE_FILES:
					basename, ext = os.path.splitext(filename)
					ext = ext[1:]	# remove leading '.'
					type = TYPES_INV.get(ext)
					if type is not None:
						lc = linecount(os.path.join(dirpath, filename))
						inc_dict(lines, type, lc)
	result[project] = lines
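
The loop relies on a documented property of os.walk: in the default top-down mode, removing entries from dirnames in place keeps the walk from descending into those directories. Here is a small self-contained sketch of that behaviour, in Python 3 and with throwaway directory and file names chosen just for the demo (walk_files is a hypothetical helper):

```python
import os
import tempfile

EXCLUDE_DIRS = {'prototype'}  # demo value, not the full set from the script


def walk_files(root, exclude_dirs):
    """Yield paths of all files under root, skipping excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment edits the list in place; os.walk looks at this
        # same list object before descending further.
        dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
        for filename in filenames:
            yield os.path.join(dirpath, filename)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'src'))
    os.makedirs(os.path.join(root, 'prototype'))
    for parts in (('src', 'main.py'), ('prototype', 'prototype.js')):
        with open(os.path.join(root, *parts), 'w') as f:
            f.write('x\n')
    found = sorted(os.path.relpath(p, root) for p in walk_files(root, EXCLUDE_DIRS))

print(found)
```

Only the file under src shows up; the prototype directory is never entered.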

The actual line counting for a single file is done by the following very simple function:

def linecount(filename):
	""" determine the number of lines in a file """

	return sum(1 for line in open(filename, 'r'))
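
As a side note on the function above, here is a possible variant, sketched in Python 3: it closes the file via a with block and counts newline bytes in fixed-size chunks, so the text is never decoded. The name linecount_chunked and the chunk size are arbitrary choices. One difference to keep in mind: a final line without a trailing newline is not counted here, while the generator version counts it.

```python
import os
import tempfile


def linecount_chunked(filename, chunk_size=1 << 16):
    """Count newline bytes in a file, reading it in binary chunks."""
    count = 0
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            count += chunk.count(b'\n')
    return count


# Small demo on a temporary file with three newline-terminated lines.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as tmp:
    tmp.write('a = 1\nb = 2\nprint(a + b)\n')
demo_count = linecount_chunked(tmp.name)
os.remove(tmp.name)
print(demo_count)
```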

There are probably faster methods, but this is just very easy and “pythonic”, I think. The function inc_dict simply increments a dictionary entry:

def inc_dict(d, k, i):
	""" increment the dictionary entry d[k] by i, or set it to i if not present already """

	try:
		d[k] += i
	except KeyError:
		d[k] = i
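
The same bookkeeping can be sketched with collections.Counter from the standard library, where a missing key counts as 0, so no try/except is needed. This is Python 3, and the sample numbers are invented for the demo:

```python
from collections import Counter

lines = Counter()
# (language, line count) pairs as they might come out of the file loop
for language, lc in [('Python', 120), ('CSS', 30), ('Python', 80)]:
    lines[language] += lc  # missing keys start at 0

print(dict(lines))
```

A Counter is still a dictionary, so lines.get(type, 0) and sum(lines.values()) keep working as in the rest of the script.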

Finally, a table with all the line counts is produced:

types_total = dict((type, sum(lines.get(type, 0) for project, lines in result.iteritems())) for type in TYPES)
types = sorted(type for type, count in types_total.iteritems() if count > 0)
table = []
table.append(["Project"] + types + ["Total"])
for project, lines in sorted(result.iteritems(), key=lambda (p, l): project_key(p)):
	table.append([project.lstrip('-')] + [lines.get(type, "") for type in types] + [sum(lines.values())])
table.append(["Total"] + [types_total[type] for type in types] + [sum(types_total.values())])
print html_table(table)
print text_table(table)

The functions for outputting the table in HTML and text format are also pretty simple:


def html_table(table):
	html = ["<table>"]
	html.append("<thead><tr>" + "".join(("<th>%s</th>" % cell) for cell in table[0]) + "</tr></thead>")
	html.append("<tbody>")
	for row in table[1:]:
		html.append("<tr><th>%s</th>" % row[0] + "".join(("<td>%s</td>" % cell) for cell in row[1:]) + "</tr>")
	html.append("</tbody>")
	html.append("</table>")
	return "\n".join(html)

def text_table(table):
	row_widths = []
	for row in table:
		for index, cell in enumerate(row):
			if len(row_widths) <= index:
				row_widths += [0] * (index + 1 - len(row_widths))
			row_widths[index] = max(row_widths[index], len(unicode(cell)))
	text = []
	for row in table:
		text.append(' '.join((('%' + str(row_widths[index]) + 's') % cell) for index, cell in enumerate(row)))
	return "\n".join(text)
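
To show what the column-width logic produces, here is a compact Python 3 sketch of the same right-aligned layout using str.rjust. The function name text_table_py3 and the sample rows are made up for the demo:

```python
def text_table_py3(table):
    """Format rows as right-aligned, space-separated columns."""
    n_cols = max(len(row) for row in table)
    widths = [0] * n_cols
    # First pass: find the widest cell in each column.
    for row in table:
        for index, cell in enumerate(row):
            widths[index] = max(widths[index], len(str(cell)))
    # Second pass: pad every cell on the left to its column width.
    return "\n".join(
        " ".join(str(cell).rjust(widths[index]) for index, cell in enumerate(row))
        for row in table
    )


demo = [
    ["Project", "CSS", "Total"],
    ["a", 5, 5],
    ["bb", "", 12],
]
demo_text = text_table_py3(demo)
print(demo_text)
```

Empty strings are padded like any other cell, so missing values simply show up as gaps in the column.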

You can also download the whole Python script.

Finally, here is my result:

Project                C/C++   CSS   HTML  JavaScript    PHP  Pascal  Python   Total
Auro Kubelka               -   599      -          19  13350       -       -   13968
Exkursionsbauernhoefe      -   192      -           -   3357       -       -    3549
Mathics                    -   827    489        1971      -       -   24215   27502
oekosozialmarkt.com        -  3019   9201        2685      -       -   24443   39348
Rindfleischfest            -   240      -           -    966       -       -    1206
Stanzoptimierung        7976     -      -           -      -       -       -    7976
tripedia.org               -  1881   7542        4117      -       -   17590   31130
BatchWorks                 -     -      -           -      -    8325       -    8325
livelynet.net              -   296   3379         357      -       -    8409   12441
Mathador                8857     -    343           -      -       -       -    9200
Preisdetektiv              -   119    277         728      -       -     952    2076
RLS-Info                   -   362  23098         325      -       -       -   23785
Total                  16833  7535  44329       10202  17673    8325   75609  180506


Over 180,000 lines of code written so far! And that covers only my major projects, and not even all of them. Smaller stuff written for university assignments (especially a lot of Mathematica code), small tools and tests are not included in this statistic.

Stanzoptimierung

I’ve finished the optimization project I’ve been working on for the last few weeks: together with Eranda Dragoti-Cela, Elisabeth Gassner, Johannes Hatzl and Bettina Klinz, I created a C++ program that optimizes the configuration and punching plan of a punching machine. The machine will start to operate in a Polish factory this summer. Below are some pictures of it.

Punching machine 1

Punching machine 2

Punching machine 3

Rindfleischfest

The homepage about the Styrian “Rindfleischfest” for the Chamber of Agriculture is finished and online at www.rindfleischfest.at. It’s a rather simple, but beautiful and hopefully informative PHP website presenting the schedule of the festival and the various companies and organisations participating.

Rindfleischfest

New design

Right on time for Christmas, tripedia.org has got its new design and layout! Looks quite sexy, doesn’t it? ;-)

To illustrate the development of tripedia’s style, here are some screenshots from October 2007, March and November 2008, and now.


(Oct 07)


(March 08)


(Nov 08)


(Dec 08)

Moreover, I’ve completely rewritten large parts of the code and “refactored” many features. There are also some new features, e.g. a small map for every photo where you can specify where it was taken. This particular feature will be extended considerably in the future.