Sunday, July 23, 2006, 05:39 PM
- PhD, AI
I finally got around to reading Brooks' "Intelligence without Representation", given to me by Frank ages ago. The paper discusses an interesting architecture for intelligent programs that does not rely on a representation of the external world (etc., read the paper). Which is interesting... BUT more interesting, I found, was the discussion of how many AI researchers deceive themselves by using overly simple scenarios for their experiments, either virtual worlds such as box-world, or even simplified versions of the real world, with matte walls, colour-coded objects etc. (There is a related, but kind of opposite, argument made by Hofstadter, which I won't go into here.)
Brooks argues that the only way to develop intelligent systems is
[...] to build completely autonomous mobile agents that co-exist in the world with humans, and are seen by those humans as intelligent beings in their own right.
These claims were made in 1987, and in the last 20 years the internet has brought us a completely new "real" world where many people spend hours every day. We now have a complex and, for any human use, practically infinite world of things to interact with.
The internet removed one layer of the difficulties of perception: the need to interpret the not very well understood, noisy, high-bandwidth channels of sound and vision. Instead an intelligent "creature" (to borrow Brooks' terminology) can work on textual documents; these are still noisy, but at least understanding natural language seems slightly easier than understanding images.
Perhaps with the advent of the Semantic Web the life of AI researchers has become much easier again. What previously was an unrealistic "abstraction" of the problem, i.e. ignoring the text-parsing and understanding problems, claiming that someone else would solve this and that our work takes the already extracted semantic content as input, has now become quite a reasonable argument.
I suppose what I'm saying at the end of the day is "Thank you Semantic Web people", you have made it possible for me to work on what I grandiosely call an "intelligent" system, without having to solve ALL the problems!




Thursday, July 6, 2006, 04:37 PM
- Python
Ever since I got my Mac I've had endless problems with tagging (man, that word really is getting overloaded these days) my music collection. Not only did iTunes forget most of the tags I had lovingly created in QuodLibet (a far superior tagger/music library/player), it also did not support multiple values for a single tag (for instance a song by Miles Davis AND John Coltrane?). Back then I was sort of saved by some AppleScript hacking, some patience, many hours of trying to face the fact that there were no good music players for MacOSX, and finally some effort to forget that I had previously lived in a perfect world... NOW, fast forward to earlier this week: I was reading about the British Recording Industry suing AllOfMP3, and a Norwegian newspaper writing that it was still safe and sound for Norwegians to buy music there (and maybe also the recent German crusade against illegal downloading...). So I decided to see what the fuss was about, and after I saw that they had several Ars Nova albums, which I have been unable to get hold of anywhere else in the world, I created an AllOfMP3 account and spent $10 to top up my balance... Shortly after I was the proud owner of "The Goddess of Darkness" and "The Book of the Dead", all for less than $3... Amazing. Unfortunately you have to download each song individually, which takes a long time if you get a bit shopping happy; downloading a zip of a whole album would make much more sense... I'm sure I could script the process though. (In doing this I found that each time I right-clicked and did save-as in Firefox the whole application would freeze for 5s; a bit of googling revealed that clearing the download list fixes this, because the list is kept in a file called "downloads.rdf", which is slow to update, *ahem*)
Now to the technically interesting bit: the MP3s I downloaded were 192kbps, with ID3 tags containing the artist and song name, and with filenames like this: 03-morgan_192_lame_cbr.mp3. The genre of all the songs I downloaded was set to "Blues" for no apparent reason (this held for Ars Nova, Bruce Springsteen and Røyksopp), but iTunes makes it easy to set the genre for a bunch of songs. Worse, the ID3 tags did not contain track numbers, and although this information is in the filename, iTunes has no easy way to extract it. QuodLibet/ExFalso of course does this very well indeed, but I've been unable to run them on MacOSX... BUT they are written in Python, and the tag handling is a separate library called Mutagen, which runs nicely on MacOSX, so within 15 minutes I had my new MP3s tagged correctly! Hurrah.
The script is at http://www.dfki.uni-kl.de/~grimnes/2006 ... knumber.py if anyone cares! If I feel inspired some day I might write a more general regexp based one, indeed I could probably steal most of it from QuodLibet...
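For anyone who just wants the idea, here is a minimal sketch of that kind of tagging (not the script linked above). It assumes filenames start with the track number followed by a dash, as in the example 03-morgan_192_lame_cbr.mp3, and uses Mutagen's EasyID3 interface; the function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed filename layout, based on "03-morgan_192_lame_cbr.mp3":
# a leading track number, a dash, then the rest of the name.
TRACK_RE = re.compile(r"^(\d+)-")


def track_number(filename):
    """Return the leading track number of a filename as an int, or None."""
    match = TRACK_RE.match(Path(filename).name)
    return int(match.group(1)) if match else None


def tag_directory(directory):
    """Write the track number from each MP3 filename into its ID3 tag."""
    # Imported here so the parsing helper above works without Mutagen installed.
    from mutagen.easyid3 import EasyID3

    for path in sorted(Path(directory).glob("*.mp3")):
        number = track_number(path.name)
        if number is None:
            continue  # filename does not follow the assumed pattern
        audio = EasyID3(str(path))  # expects the file to already have an ID3 tag
        audio["tracknumber"] = str(number)
        audio.save()
```

Calling tag_directory on an album folder would then fill in the missing track numbers; files that do not match the pattern are left alone.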
(A side note, I need to stop reading Hunter S. Thompson to get rid of these looong rambling sentences with no point, OR maybe I have to read more... )




Tuesday, June 20, 2006, 10:26 PM
- PhD
This is the first real result from Smeagol, where it actually makes a plan to learn and does succeed! 
[Figure: diagram of the plan Smeagol made and executed]
I realise that this diagram is probably completely incomprehensible to anyone but me, so here is a quick explanation:
1. The "top-level goal" given in this case is to answer the query:
[] ql:select ( ?s ) ;
   ql:where {
      ?s a bibtex:InProceedings ;
         foaf:maker ?a .
      ?a pers:expertIn wiki:Semantic_Web .
   } ;
   ql:results ?r .
i.e. give all papers written by Semantic Web Experts.
2. Smeagol first makes the trivial plan to read some data (there are many issues about this, I skip them all), and perform the query.
3. The trivial plan fails (otherwise this would be boring), because there are no people who are Semantic Web experts who have written any papers (in my extremely artificial hand-made dataset). Papers written by other people DO exist...
4. Several plans are made that may introduce more triples matching any of the patterns evaluated in the query before it failed. In this particular case my heuristic reordered the query to be [(?s a Inproceedings),(?a expertIn SemanticWeb), (?s maker ?a)], i.e. find all the people and papers first and do the join later, and it was the very last pattern that failed to match, therefore further results for any of these patterns could be useful.
5. The easiest plan (the actions have weights) is chosen first: read the ontology for the bibtex classes and attempt to find more "things" that are "InProceedings" based on RDFS inference; maybe they are only explicitly declared to be of a subclass of InProceedings, for example. Unfortunately, this is not the case, and this plan fails.
6. Smeagol returns to the second easiest plan, which involves actual learning: find the set of things that are already of type InProceedings => attempt to use ILP to learn a description of this set => use this description to classify further instances as InProceedings. Alas, this also fails, in this case because it could not find any good negative examples, but I won't go into that here.
7. Returning to the third easiest plan, this is using the same pattern as the previous plan, but finds a set of people who are Semantic Web experts instead. In this case the description learning DOES work and we learn the rule:
{ ?A <http://purl.org/swrc#conferenceChair> <http://iswc2004.org/> }
log:implies
{ ?A rabbi:learnedCategory "blah" }.
8. This rule is then used to classify more instances as Semantic Web experts, and since I very carefully constructed this example it finds one! :)
9. After the sub-plan has succeeded, Smeagol returns to the previous failed plan and re-tries the failed action, and now the query works. Hurrah!
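The overall control flow of the steps above (try the query, fall back to the cheapest plan that might add useful triples, then retry) can be sketched in a few lines. This is a toy illustration and not Smeagol's code: the triple store is a plain Python set, the plan functions and their weights are invented, and the "learned" rule is hard-coded rather than produced by ILP.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to something else
        elif p != t:
            return None
    return new


def query(patterns, triples):
    """Evaluate a conjunctive query; return a list of variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings


def subclass_plan(triples):
    """RDFS-style step: instances of a subclass also get the superclass type."""
    new = {(s, "a", sup)
           for (s, p, sub) in triples if p == "a"
           for (c, q, sup) in triples if q == "subClassOf" and c == sub}
    return new - triples


def learned_rule_plan(triples):
    """Hard-coded stand-in for the learned rule: ISWC 2004 chairs are experts."""
    new = {(s, "expertIn", "SemanticWeb")
           for (s, p, o) in triples if p == "conferenceChair" and o == "iswc2004"}
    return new - triples


def answer(patterns, triples, plans):
    """Try the query; while it fails, run plans cheapest-first and retry."""
    triples = set(triples)
    results = query(patterns, triples)
    for _cost, plan in sorted(plans, key=lambda item: item[0]):
        if results:
            break
        added = plan(triples)
        if not added:
            continue  # this plan produced nothing new, so it counts as failed
        triples |= added
        results = query(patterns, triples)
    return results


DATA = {
    ("paper1", "a", "InProceedings"), ("paper1", "maker", "alice"),
    ("alice", "conferenceChair", "iswc2004"),
    ("paper2", "a", "InProceedings"), ("paper2", "maker", "bob"),
}
QUERY = [("?s", "a", "InProceedings"),
         ("?a", "expertIn", "SemanticWeb"),
         ("?s", "maker", "?a")]
PLANS = [(1, subclass_plan), (3, learned_rule_plan)]

papers = sorted({b["?s"] for b in answer(QUERY, DATA, PLANS)})
```

On this hand-made data the plain query returns nothing, the subclass plan adds nothing, and only after the rule plan marks alice as an expert does the retried query find paper1.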
All in all this took a ridiculous number of lines of code, many hours of debugging, and the result still isn't very impressive, but hopefully I have ironed out most planning bugs now and I can get to work on creating the final set of examples that will finish off this PhD! :)




Sunday, June 18, 2006, 08:28 PM
- Semantic Web
ESWC is now over and it was great fun. I might write something later about interesting talks I saw and people I met, but here is something about a quick hack I did during some less interesting talk :), this was also partly written while in Montenegro: ----
Since Tom Heath kindly made the effort to eat some dog food and produced lots of RDF data about ESWC, I thought I should be a good semantic webber and try to do something with this data. (I won't try to extend the dog-food analogy... )
To get all the data I started with this url and crawled seeAlso links from there, this made sure I would also get some FOAF data for individual users where their profile was linked. When I got tired of watching it crawl I had crawled a few hundred RDF files and some 10,000 triples.
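A crawl like this is essentially a breadth-first walk over rdfs:seeAlso links. The sketch below is a rough illustration, not the code I ran: the fetch argument is a hypothetical caller-supplied function (for instance a thin wrapper around an RDF parser) that returns the triples found at a URL, and the document limit is an arbitrary choice.

```python
from collections import deque

SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def crawl(start_url, fetch, max_documents=300):
    """Breadth-first crawl of rdfs:seeAlso links.

    `fetch(url)` is a caller-supplied (hypothetical) function returning the
    (subject, predicate, object) triples found at `url`.
    """
    seen = {start_url}
    queue = deque([start_url])
    triples = set()
    fetched = 0
    while queue and fetched < max_documents:
        url = queue.popleft()
        try:
            document = list(fetch(url))
        except Exception:
            continue  # skip documents that cannot be fetched or parsed
        fetched += 1
        triples.update(document)
        for _subject, predicate, target in document:
            if predicate == SEE_ALSO and target not in seen:
                seen.add(target)
                queue.append(target)
    return triples


# A two-document in-memory "web" standing in for real HTTP fetches.
FAKE_WEB = {
    "http://example.org/eswc": [
        ("http://example.org/eswc", SEE_ALSO, "http://example.org/tom"),
    ],
    "http://example.org/tom": [
        ("http://example.org/tom", "http://xmlns.com/foaf/0.1/name", "Tom"),
        ("http://example.org/tom", SEE_ALSO, "http://example.org/eswc"),
    ],
}
all_triples = crawl("http://example.org/eswc", FAKE_WEB.__getitem__)
```

The seen set keeps the crawler from looping when two documents point at each other, which FOAF profiles tend to do.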
So since I always do the same thing I applied my trusty old clustering and then ILP to learn descriptions of the clusters. At first I tried clustering the people, but got nothing interesting: I could recreate the organising committee and have all the other delegates cluster together, or I could do just the delegates, and get 1 big cluster with most of them and a few people in clusters by themselves. Quickly giving this up I tried to cluster the papers instead. This wasn't wildly successful either, but at least I recreated the demo session, poster session and main conference paper groups as clusters, excluding a few papers, which were clustered on their own:
- WikiFactory: a web ontology-based application for creating domain-oriented wikis
- Using Semantics to Enhance the Blogging Experience
- Towards A Complete OWL Ontology Benchmark
If anyone knows why these are 'magic' please let me know :)
If you really care you can also look at the full results here
After looking at the properties of the papers in Tom's data, it was obvious that it would be hard to get anything more sensible out of this, so I gave this up as well and went back to paying attention to the talks.
The only good bits to come out of this were that I streamlined getting 'nice' html output from the commandline, so I can easily do it later, and that the lack of "topic" meta-data made me think of the application that in the end won me the iPod Nano in the ESWC Design Challenge! Yay!




Saturday, May 20, 2006, 03:15 PM
- Python
Today's pointless hack was brought on by the fact that I happened to be in possession of the complete dilbert comics up until 2005. Often I remember some particular strip, but have no way to sensibly search thousands of GIF files. (I *could* join comics.com, but that would be less fun, and I already have nearly all the dilbert books, so this info is sort of 'mine' already, ahem). Also inspired by my recent re-reading of Hofstadter's Letter Spirit, I set to work. (Note that there are good commercial solutions for this, and probably lots of well known algorithms etc., but I wanted to discover this myself. I did a quick google for tutorials on OCR, but nothing sensible came up.)
So, looking at some strips, the dilbert text seems to have many nice features for automatic extraction:
- The font is always the same size/type
- The letters are all capitals
- The lines of text are always straight
- There is always white-space behind the text
- There is very little punctuation and there are very few digits
There are some exceptions to all of these, but these bits I can live without:

So with the GIMP, Python, ImageMagick, Numeric and PIL I set to work. First I cut out one of each letter and auto-cropped them in the GIMP:

The first plan was to scan every pixel of the big image for a match with every letter, a match being defined as the sum of the errors for the overlapping pixels being below a certain threshold. Since most dilberts (apart from sunday strips) are black and white anyway, I did everything in gray-scale mode and the pixel values are simply byte values. The threshold for a "match" was set at 40 after some trial and error.
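To make the idea concrete, here is a small pure-Python sketch of this kind of brute-force matching. It is not the original code: it assumes "error" means the absolute difference between gray values, applies the threshold of 40 to the total error of a placement, and stores pixels in a flat row-major list indexed as y*w+x. All names are invented for illustration.

```python
def match_error(image, image_width, template, template_width, x0, y0):
    """Sum of absolute pixel differences with the template placed at (x0, y0)."""
    template_height = len(template) // template_width
    error = 0
    for ty in range(template_height):
        for tx in range(template_width):
            image_pixel = image[(y0 + ty) * image_width + (x0 + tx)]
            error += abs(image_pixel - template[ty * template_width + tx])
    return error


def find_matches(image, image_width, template, template_width, threshold=40):
    """Return every (x, y) where the template matches with error below threshold."""
    image_height = len(image) // image_width
    template_height = len(template) // template_width
    return [(x, y)
            for y in range(image_height - template_height + 1)
            for x in range(image_width - template_width + 1)
            if match_error(image, image_width, template, template_width, x, y)
            < threshold]


WHITE, BLACK = 255, 0
IMAGE_WIDTH = 4
IMAGE = [
    WHITE, WHITE, WHITE, WHITE,
    WHITE, BLACK, BLACK, WHITE,
    WHITE, BLACK, BLACK, WHITE,
]
TEMPLATE_WIDTH = 2
TEMPLATE = [BLACK, BLACK,
            BLACK, BLACK]
matches = find_matches(IMAGE, IMAGE_WIDTH, TEMPLATE, TEMPLATE_WIDTH)
```

The cost is one template-sized comparison per pixel position per letter, which goes a long way towards explaining why this approach crawls on a full strip.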
This process was slow as cancer, even when I changed from Python lists to Numeric arrays. (Later it turned out that using a flat list and computing the offset as y*w+x is actually quicker than Numeric arrays, even without psyco, odd...) The first attempt, for a single letter only, looked like this:

Tweaking the numbers for the "match" threshold and removing testing for overlapping matches etc. speeded it up a bit, and I soon discovered another problem. Here illustrated by looking for matches for i's in the first frame:

The problem was that the tightly cropped I image was matching lots of sub-sections of other letters. Since the thing was still horribly slow I decided to try a slightly different approach: I would detect the base-lines of each line of text in the picture. Again, this should be pretty easy apart from a few comics where there is content next to the text, but I would worry about that later. First attempt at finding lines that were largely empty:

Then group together blocks of lines and remove the ones too close to the top to fit a whole letter:
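As a rough sketch of these two steps (my reading of the idea, not the original code): mark each pixel row as empty if it contains almost no dark pixels, then take the first row of each block of empty rows as a candidate base-line, dropping candidates that leave no room for a letter above them. The flat row-major pixel list, the thresholds and the function names are all assumptions.

```python
def empty_rows(image, width, white=255, max_dark_pixels=0):
    """Return one boolean per pixel row: True if the row is (almost) all white."""
    height = len(image) // width
    flags = []
    for y in range(height):
        row = image[y * width:(y + 1) * width]
        dark = sum(1 for pixel in row if pixel < white)
        flags.append(dark <= max_dark_pixels)
    return flags


def baselines(flags, letter_height):
    """Return the first row of each block of empty rows with room for a letter above."""
    result = []
    previous_empty = False
    for y, is_empty in enumerate(flags):
        if is_empty and not previous_empty and y >= letter_height:
            result.append(y)
        previous_empty = is_empty
    return result


# A 3-pixel-wide toy image: one empty row, two rows of "text", one empty row.
TOY_WIDTH = 3
TOY_IMAGE = [
    255, 255, 255,
    255, 0, 255,
    0, 0, 0,
    255, 255, 255,
]
toy_flags = empty_rows(TOY_IMAGE, TOY_WIDTH)
toy_baselines = baselines(toy_flags, letter_height=2)
```

Letters then only need to be tried along the rows just above each base-line, instead of at every pixel of the strip.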

Now I retried the above matching of letters, trying each letter at every possible X position along each line, then ordering the matches by X value.
This produces the first real result, i.e. it spat out some text:
vititingacuitmer (correct: visiting a customer)
odurodftfitoas] (correct: our office was)
iodesitgneodwithltfhe (correct: designed with the)
citenceocgepfecghuitl (correct: science of feng shui)
The correct text is given in brackets after each line of output. Interestingly this produces more letters than in the original :) So I tried grouping the letters that are duplicates, i.e. both were detected in the same place:
v<it><it>ing a cu<i t>mer
<od>ur <od>f<tf><it>o as
<iod>es<it>gne<od> w<it>h lt<fh>e
c<it>ence <ocg><ep> fe<cg> hu<itl>
It's not quite useful yet... even if I generated all possible words for each ambiguous character I wouldn't get anything sensible.
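The grouping, and the "generate all possible words" idea, can be sketched like this. It is an illustration only: detections are assumed to be (x, letter) pairs for one line of text, two detections count as "the same place" if their X positions are within a couple of pixels of each other, and the sample data is made up to reproduce the first group of output above.

```python
from itertools import product


def group_detections(detections, tolerance=2):
    """Group (x, letter) detections whose x lies within tolerance of the previous one."""
    groups = []
    last_x = None
    for x, letter in sorted(detections):
        if last_x is not None and x - last_x <= tolerance:
            groups[-1].append(letter)  # same place: an ambiguous character
        else:
            groups.append([letter])
        last_x = x
    return groups


def render(groups):
    """Show ambiguous groups in angle brackets, e.g. v<it><it>ing."""
    return "".join(g[0] if len(g) == 1 else "<" + "".join(g) + ">"
                   for g in groups)


def candidates(groups):
    """Every string obtained by picking one letter from each group."""
    return ["".join(choice) for choice in product(*groups)]


DETECTIONS = [(0, "v"), (10, "i"), (10, "t"), (20, "i"), (21, "t"),
              (30, "i"), (40, "n"), (50, "g")]
GROUPS = group_detections(DETECTIONS)
```

The number of candidates is the product of the group sizes, so it grows quickly, and none of them can contain a letter (like the S) that was never detected in the first place.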
Also, there is clearly a big problem in recognising S's and in telling Is and Ts apart. Maybe I shouldn't disallow overlapping boxes... this made it slightly slower again, but didn't improve results.
So after two evenings of hacking I now had some random text that was completely useless. My instinct told me I had probably taken the simple sum-of-error approach as far as it would go, and now - there is a fork in the path:
1. Keep considering the letters independently, but let the program learn, i.e. sit through a few sessions of : "i reckon this is an 'S', no that's a 'Z', try harder.... "
2. Make the matching pattern of each letter more aware of the special features of each letter, i.e. ignore the bits the I and T have in common, focus on the difference. Not sure how to do this when 3 (or more) letters match the same thing... I would probably just try a dodgy hack and see where I get to.
3. Neural networks do 2 much better than I can ever hope to.
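For two letters only, option 2 could look something like the sketch below: build a mask of the pixels where the two templates clearly differ and compare a candidate patch on those pixels alone. This is a guess at one possible dodgy hack, not something from the original program; it assumes both templates are padded to the same size, and it says nothing about the case where three or more letters match.

```python
def difference_mask(template_a, template_b, min_difference=128):
    """Indices of pixels where two same-sized templates clearly differ."""
    return [i for i, (a, b) in enumerate(zip(template_a, template_b))
            if abs(a - b) >= min_difference]


def closer_label(patch, template_a, template_b, mask, labels=("a", "b")):
    """Return the label of the template closest to patch on the masked pixels."""
    error_a = sum(abs(patch[i] - template_a[i]) for i in mask)
    error_b = sum(abs(patch[i] - template_b[i]) for i in mask)
    return labels[0] if error_a <= error_b else labels[1]


W, B = 255, 0
# Toy 3x3 templates as flat lists: an I and a T share the vertical stroke.
LETTER_I = [W, B, W,
            W, B, W,
            W, B, W]
LETTER_T = [B, B, B,
            W, B, W,
            W, B, W]
MASK = difference_mask(LETTER_I, LETTER_T)

NOISY_T = [10, 0, 20,
           255, 0, 255,
           250, 5, 255]
guess = closer_label(NOISY_T, LETTER_I, LETTER_T, MASK, labels=("I", "T"))
```

Here only the two corner pixels of the top bar end up in the mask, which is exactly the bit that tells a T from an I.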
I've done some more work on this now, and I will shortly follow it up with a chapter II! :)
I was a bit late in typing up this part I -- I needed a Saturday to find the time (and the need to prepare 150 slides on Semantic Web Services for Monday made it easy to want to do other things...)
PS: Leo was convinced I was wasting my time (BUT IT'S THE JOURNEY NOT THE DESTINATION!), and tried OmniPage Pro on some dilbert cartoons, and it sucked! Ha! This isn't completely pointless after all!







