I have quite an interest in Artificial Intelligence as well as machine learning. As a test to see the costs of such a personal project, I thought I would post a project here, and if the costs are acceptable I may go ahead with it.
The concept is quite simple, although I am unsure whether the implementation would be quite as simple. The project may even be broken up into parts, i.e. not just one program but multiple ones, depending on costs and whether it would work.
The concept is to have an AI bot that supports speech recognition, like Home Automated Living's HAL product, so you can talk to the bot (in either text or speech) and it will respond (in either text or speech), but it would also have some learning ability. The bot could learn what you tell it, as well as look up facts when you ask it questions. So that is the bot part. It would be nice to give it a little peer-to-peer capability so it can share information with other bots and collaborate somehow. I have a Linux server, so it could even use torrent trackers or something similar, as long as it can get through NAT somehow.
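To make the "learning ability" concrete, here is a minimal sketch of the learn/recall loop in Python, text only (speech recognition and synthesis would just wrap around this). All names here are illustrative, not a spec:

```python
# Minimal sketch of the bot's learn/recall loop. Text in, text out;
# a speech front end would sit on top of this. Names are assumptions.

class FactBot:
    def __init__(self, bot_id):
        self.bot_id = bot_id   # later used to tag facts as this bot's own
        self.facts = {}        # subject -> fact text

    def handle(self, message):
        # "learn: sky is blue" stores a fact; anything else is a lookup.
        if message.startswith("learn: "):
            subject, _, fact = message[len("learn: "):].partition(" is ")
            self.facts[subject.strip()] = fact.strip()
            return "OK, I'll remember that."
        answer = self.facts.get(message.strip())
        return f"{message.strip()} is {answer}." if answer else "I don't know yet."

bot = FactBot("bot-1")
print(bot.handle("learn: sky is blue"))   # OK, I'll remember that.
print(bot.handle("sky"))                  # sky is blue.
```

A real version would replace the dictionary with the shared knowledge database described below, but the interaction shape stays the same.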
Part 2 is the knowledge-learning crawler. Basically, I would like to build a big database (I know it will be big; it will practically be a data mine) that consists of lots of information from sites like [url removed, login to view], OpenCyc ([url removed, login to view]), as well as any other human knowledge bases you can think of. The crawler would crawl the web constantly, indexing each page, extracting data and information, and putting it into the database, thus expanding the knowledge base more and more as time goes on. The bots would also interface with the database, putting their information into it and getting information out of it. So if you tell a bot a fact, it is recorded with that bot's ID; if you want the fact to be private, only that bot can access or use it, while if it is public, it can be accessed by any of the bots.
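The private/public fact idea above is essentially a visibility column keyed on the bot ID. A hypothetical schema could look like the following, sketched with sqlite3 standing in for the MySQL server; the table and column names are assumptions:

```python
# Hypothetical fact-store schema with per-bot privacy. sqlite3 stands in
# for the central MySQL database; names are illustrative, not a spec.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE facts (
        id      INTEGER PRIMARY KEY,
        bot_id  TEXT NOT NULL,               -- which bot recorded the fact
        fact    TEXT NOT NULL,
        private INTEGER NOT NULL DEFAULT 0   -- 1 = only the owning bot may read it
    )""")

db.execute("INSERT INTO facts (bot_id, fact, private) VALUES (?, ?, ?)",
           ("bot-1", "my owner's birthday is in June", 1))
db.execute("INSERT INTO facts (bot_id, fact, private) VALUES (?, ?, ?)",
           ("bot-1", "water boils at 100 C at sea level", 0))

def visible_facts(db, bot_id):
    # A bot sees every public fact plus its own private ones.
    rows = db.execute(
        "SELECT fact FROM facts WHERE private = 0 OR bot_id = ?", (bot_id,))
    return [r[0] for r in rows]

print(visible_facts(db, "bot-2"))  # only the public fact
print(visible_facts(db, "bot-1"))  # public fact plus its own private one
```

The same `WHERE private = 0 OR bot_id = ?` filter would apply to any query the bots run against the shared database.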
The bots could even do distributed crawling, maybe? I don't know, that's just an idea. The crawlers would run on Linux, and the bot on Windows like HAL. You can use any open-source solution you want and do it however you want; I don't expect you to reinvent the wheel. As long as it can learn, build on knowledge, use that collected knowledge, and draw inferences, that's great.
Give me your bids, but remember it's a personal project, not a million-dollar budget project :p.
The crawler could be a distributed architecture with a central MySQL database, but it still has to be able to function on a single machine or across multiple machines.
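One common way to get that "one machine or many" property is a central crawl queue that workers claim URLs from: with one worker it degrades to a plain queue, with many it distributes itself. A rough sketch, again with sqlite3 standing in for the central MySQL server and with all names assumed:

```python
# Sketch of a central crawl queue. Workers on any number of machines
# claim URLs from one shared table, so the same code serves a single
# machine or a cluster. sqlite3 stands in for MySQL; schema is assumed.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE crawl_queue (
        url        TEXT PRIMARY KEY,
        claimed_by TEXT              -- NULL until some worker claims it
    )""")
for url in ("http://example.org/a", "http://example.org/b"):
    db.execute("INSERT INTO crawl_queue (url) VALUES (?)", (url,))

def claim_next_url(db, worker_id):
    # Claim one unclaimed URL for this worker. With real MySQL this
    # should be done in a transaction (e.g. SELECT ... FOR UPDATE)
    # so two workers cannot grab the same URL.
    row = db.execute(
        "SELECT url FROM crawl_queue WHERE claimed_by IS NULL LIMIT 1"
    ).fetchone()
    if row is None:
        return None  # queue drained
    db.execute("UPDATE crawl_queue SET claimed_by = ? WHERE url = ?",
               (worker_id, row[0]))
    return row[0]

print(claim_next_url(db, "worker-1"))
print(claim_next_url(db, "worker-2"))
print(claim_next_url(db, "worker-1"))  # nothing left -> None
```

Each worker would fetch and parse its claimed page, insert extracted facts into the fact store, and push newly discovered links back into the queue.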