Archive for the "Programming" Category

Java 8: File handling becomes fun

With the introduction of functional-style syntax and several new Stream-based methods in Java 8, file handling finally becomes fun and easy. In fact, the new helper methods and lambda syntax even give Python a run for its money when it comes to compact code.

Here is how you could read all lines of a file, given as a Path p (available since Java 7), and print them to stdout.

Files.lines(p).forEach(System.out::println);

To make it clearer what is going on, here is the same thing broken up into separate steps, including closing the stream afterwards.

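Path p = Paths.get("myfile");

// Files.lines reads the file lazily and may throw IOException
Stream<String> lines = Files.lines(p);
lines.forEach(System.out::println);
// The stream holds an open file handle, so close it when done
lines.close();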

A similarly neat helper function exists for recursively walking over the directory tree of the file system. Again, this prints to stdout.

Files.walk(p).forEach(System.out::println);
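
For comparison with the Python claim above, here is a rough sketch of the same two tasks in Python 3. The helper names read_lines and walk_tree, and the throwaway directory used for the demo, are made up for illustration; only the standard library is assumed.

```python
import os
import tempfile
from pathlib import Path


def read_lines(path):
    """Return the lines of a text file without trailing newlines."""
    # The with-block closes the file, much like closing the Java stream.
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def walk_tree(root):
    """Return root and everything below it, similar in spirit to Files.walk."""
    found = [Path(root)]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames) + sorted(filenames):
            found.append(Path(dirpath) / name)
    return found


# Demo on a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "myfile"
    sample.write_text("first\nsecond\n", encoding="utf-8")
    (Path(tmp) / "sub").mkdir()

    lines = read_lines(sample)
    entries = [p.relative_to(tmp).as_posix() for p in walk_tree(tmp)]

for line in lines:
    print(line)
for entry in entries:
    print(entry)
```

The Java one-liners are arguably just as short as the Python core of this, which is the point of the comparison.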

UTF8 in MySQL

For some unknown reason, the default MySQL character set in Debian 7 (“Wheezy”) is different to Ubuntu 12.04 (“Precise Pangolin”). The former uses latin1, while the latter uses utf8. (This could also be down to other local configuration I’m not aware of.) The difference between the two is rather subtle, and might go unnoticed. One visible effect shows up when ordering text that contains German umlauts or Scandinavian-specific letters. For example, the Norwegian letter Å (aring in HTML) is the last of the alphabet, but when using latin1, it is interpreted as an A (or possibly double a: aa) and ordered first. See this bug report and discussion for more details and examples.

The character set is specified in multiple places in the MySQL DBMS: on the database and the table, and for the client, server, connection, result set and underlying system. To view the current settings, these two commands give an overview (see here for details):

SELECT * FROM information_schema.SCHEMATA;
SHOW VARIABLES LIKE 'char%';

The output will be something like:

+--------------+--------------------+----------------------------+------------------------+----------+
| CATALOG_NAME | SCHEMA_NAME        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH |
+--------------+--------------------+----------------------------+------------------------+----------+
| def          | information_schema | utf8                       | utf8_general_ci        | NULL     |
| def          | test               | latin1                     | latin1_swedish_ci      | NULL     |
| def          | myTESTdb           | latin1                     | latin1_swedish_ci      | NULL     |
+--------------+--------------------+----------------------------+------------------------+----------+
 
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

The above is from the Ubuntu 12.04 system where there were no problems with the character set of text values in the tables. Notice that there are some latin1 settings there, but that seems to be fine. The difference on the Debian 7.0 system was that these variables were set to latin1: character_set_client, character_set_connection, character_set_results. One way to change them is through SET commands in the MySQL console:

SET character_set_connection = utf8;

However, that will not persist the settings. Furthermore, it will not be enough to fix the tables. Instead, the following lines have to be added to the config file /etc/mysql/my.cnf (from this discussion).

[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8

Finally, after that is done, the MySQL server has to be restarted, and the data with the special UTF8 characters has to be re-inserted. That is a bit of a pain, and I did not dig too far into how to resolve the problem for existing data. There is a discussion on how to alter the DB and tables here, but I suspect the data will still contain the wrong characters, so it will not be enough. Fundamentally, the byte representations in latin1 and utf8 for these letters are different, so the stored bytes have to be replaced somehow, as far as I understand.
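
The byte difference can be sketched in a few lines of Python 3. This only illustrates the encodings themselves, not what MySQL does internally; using Python's latin-1 codec as a stand-in for MySQL's latin1 is an assumption, and the variable names are made up.

```python
letter = "\u00c5"  # the Norwegian letter Å

latin1_bytes = letter.encode("latin-1")  # a single byte
utf8_bytes = letter.encode("utf-8")      # two bytes

# Reading the UTF-8 bytes as if they were latin1 gives two wrong characters.
misread = utf8_bytes.decode("latin-1")

# Reversing that misinterpretation recovers the original letter.
repaired = misread.encode("latin-1").decode("utf-8")

print(latin1_bytes.hex(), utf8_bytes.hex())
print(len(misread), repaired == letter)
```

So a value written under one encoding and read under the other does not come back as the same letter, which fits the suspicion that changing the settings alone will not repair rows that are already stored.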

An alternative to all this is to get it right from the get-go. The MySQL documentation outlines how a new DB can be created with a specific character set. Again I’m not sure if that is enough, as the settings in the my.cnf file deal with the encoding of the client, connection, and result. There are probably several ways to solve the issue, though.

Touch events in JavaScript

Looking to extend my “HTML5 Canvas Panorama Viewer” to make it work on mobile phones, I’ve started to dig into the wonderful world of incompatible touch events between all the browsers. It’s IE vs. WebKit vs. Gecko vs. Opera, or back to the 90s in other words.

Peter-Paul Koch’s site quirksmode does a good job of summarising the various aspects of mobile web development. In particular, the touch table is relevant to what I wanted to achieve. However, it does not look too promising right now, as most events are listed as either not supported or incomplete.

Still, an old tutorial by nroberts, “Touching and Gesturing on the iPhone”, gives hope. It includes a small example (http://tinyurl.com/sp-iphone) which actually does work to some extent on both the Android native browser and Firefox 6.

Now the challenge is to combine it all, and have both desktop and mobile browsers behave in an expected and functional manner. Using the same events for the same actions probably does not make sense; however, the user should be able to achieve the same results across all browsers. The panorama viewer needs only three actions: zoom in, zoom out, and move. Zoom in already works, move works but is buggy, and some touch gesture must replace right-click to zoom out. I’ll come back to this later.

Programming Books

StackOverflow recently voted on the “most influential programming book”. The Internet Security Blog brings a neatly formatted list. Finally, the question was repeated on Slashdot, which is of course like asking a class of ten-year-olds about their favourite bubble gum. Surprisingly, some interesting information could be extracted from the noise.

The top ten:

Code Complete (2nd Edition)
The Pragmatic Programmer: From Journeyman to Master
Structure and Interpretation of Computer Programs (2nd ed.)
The C Programming Language (2nd ed.)
Introduction to Algorithms
Refactoring: Improving the Design of Existing Code
Design Patterns: Elements of Reusable Object-Oriented Software
The Mythical Man-Month: Essays on Software Engineering
Art of Computer Programming, Volume 1: Fundamental Algorithms (3rd Edition)
Compilers: Principles, Techniques, and Tools (2nd Edition)

Other

The C Programming Language
Design Patterns
Deitel & Deitel
The Mythical Man-Month
John von Neumann: Theory of self-reproducing automata
The Art of Unix Programming
Starting Forth by Leo Brodie
Peter Norton’s Guide to Programming the IBM PC
Bruce Eckel’s Using C++ and Thinking in C++
The Design of Everyday Things by Donald Norman
Algorithms + Data Structures = Programs by Niklaus Wirth

And there’s even more.

Java 7 Released

Oracle (finally) released Java 7 today, 4 years and 7 months after the initial release of Java 6. Previous version updates used to be on an 18-month schedule, so that makes this release 37 months late. It’s getting a pretty good bashing on Slashdot, primarily for its current owner, Oracle.

New features include syntactic sugar like Strings in switch statements, underscores in numeric literals, and type inference for generic instance creation. The Open JDK site lists the same features, so I’m guessing that means everything is open sourced already. However, it seems smaller items on my wish-list did not make it, like the Immutable annotation. (JSR 305 has been inactive for ages, and the only implementation is still a Google Code hosted project.)

The new Java Doc is here, and the latest download from Oracle here. The Open JDK binaries are also available in the repositories of major distributions.

yum install java-1.7.0-openjdk java-1.7.0-openjdk-devel

or

sudo apt-get install openjdk-7-jre openjdk-7-jdk

Sparklines

Looking for a Python sparkline library, I found Perry Geo’s excellent code. “In the minimalist spirit of sparklines, the interface was kept simple”:

import spark
a = [32.5,35.2,39.9,40.8,43.9,48.2,50.5,51.9,53.1,55.9,60.7,64.4]
spark.sparkline_smooth(a).show()

That’s it, and here’s the result. Just download his single Python module, start up interactive Python, and off you go.

This of course sent me on a tangent, off to Edward Tufte’s work and creation of sparklines. It seems I have a book or two to buy.

The Art of Computer Programming, Volumes 1-4A Boxed Set

The fourth volume of Donald Knuth’s The Art of Computer Programming is now ready for print, and can be pre-ordered at Amazon. At about $200, I think I might add it to my next order.

Java Applet Test

Mostly for my own reference: an old Java Applet which I continue to use to verify that Java Applets work. The applet itself demonstrates De Casteljau’s algorithm for drawing a Bézier curve.

You can move the red control points, and then animate the drawing of the curve.

Back references in Java regexp

The Java API documentation for regular expressions says that:

Back references
\n Whatever the nth capturing group matched

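This seems to be difficult to get working, though. The \n form applies inside the pattern itself; in the replacement string of replaceAll, a captured group is referenced as $n instead. Here’s an example of a work-around:

str.replaceAll("user(.*)", "$1")

Example:
"username" => "name"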

A note about multiple inheritance in Python

Consider the following three classes, where C inherits from A and B, and the method m() is present in both A and B.


class A(object):
    def m(self):
        print("a.m")

class B(object):
    def m(self):
        print("b.m")

class C(A, B):
    def test(self):
        pass

Now, if you create an instance of class C and invoke method m, what will be the result?


>>> c = C()
>>> c.m()
a.m

The Python docs tutorial on the subject says “the resolution rule used for class attribute references (…) is depth-first, left-to-right”, which seems like an easy rule. However, it also warns about some maintenance nightmares, e.g. “a class derived from two classes that happen to have a common base class”.
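
The common-base-class case from that warning can be sketched as follows. This assumes Python 3, where every class is new-style and the lookup order is the C3 linearization exposed as __mro__; the class names are made up for illustration. In the simple example above this order agrees with depth-first, left-to-right, but with a shared base class it does not.

```python
class Base:
    def m(self):
        return "base.m"


class A(Base):
    pass  # A does not override m, so a depth-first search would reach Base.m


class B(Base):
    def m(self):
        return "b.m"


class C(A, B):
    pass


# The method resolution order Python actually uses for C.
mro_names = [klass.__name__ for klass in C.__mro__]
result = C().m()

print(mro_names)
print(result)
```

B comes before Base in the resolution order, so the override in B wins even though A is listed first.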

Class attributes in Python

After almost two hours of debugging, I finally made an interesting (at least to me) discovery in Python: attributes of type dictionary and list (and presumably all other object types) at the class level are “static” (as used in Java classes), while primitive types appear not to be. Does that make sense? I don’t know.

Consider the following dummy class:

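class MyClass:
    e = {"a": None}   # class-level dictionary, shared by all instances
    f = [0]           # class-level list, shared by all instances
    i = 50            # class-level int

    def __init__(self):
        self.d = {"a": None}   # instance-level dictionary

    def set(self, v):
        self.d["a"] = v
        self.e["a"] = v
        self.f[0] = v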


We have four attributes here:
e – a class level dictionary
f – a class level list
i – an int
d – another dictionary referenced by self.d inside the methods.

Now consider the following result from these interactive lines:

>>> x = MyClass()
>>> x.set(5)
>>> x.d
{'a': 5}
>>> x.e
{'a': 5}
>>> x.f
[5]
>>> y = MyClass()
>>> y.set(10)
>>> x.d
{'a': 5}
>>> x.e
{'a': 10}
>>> x.f
[10]
>>>
>>> y.i = 100
>>> x.i
50
>>> y.i
100
>>>


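As you can see, the first call to the set method sets the value 5 in x.d, x.e and x.f. However, the second call, y.set(10), also changes the values of x.e and x.f, while x.d is untouched. Finally, assigning y.i = 100 leaves x.i at 50, so the int i does not appear to follow this behaviour. The difference lies in the operation rather than the type: set() modifies the shared dictionary and list in place, whereas y.i = 100 binds a new attribute on the instance y and leaves the class attribute alone.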

Strange, I say.
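
A small sketch of what seems to be going on, assuming Python 3 and a made-up class Shared: changing a class-level object in place is visible through every instance, while assigning to an attribute on an instance only creates a new attribute on that instance, whatever the type.

```python
class Shared:
    items = [0]   # class-level list: one object shared by every instance
    limit = 50    # class-level int: also shared until an instance rebinds it

    def mutate(self, v):
        # Looks up the class-level list and changes it in place.
        self.items[0] = v

    def rebind(self, v):
        # Assignment creates a new attribute on this instance only.
        self.items = [v]


x, y = Shared(), Shared()

y.mutate(10)
seen_after_mutate = list(x.items)   # x sees the change made through y

y.rebind(99)
seen_after_rebind = list(x.items)   # x is unaffected by the rebinding

y.limit = 100       # same as rebind: an instance attribute on y
Shared.limit = 75   # rebinding on the class reaches x, but not y

print(seen_after_mutate, seen_after_rebind, x.limit, y.limit)
print("limit" in vars(x), "limit" in vars(y))
```

Under this reading, a list behaves just like the int once it is assigned instead of modified, so the type is not what decides the outcome.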
