Quest for next-generation search engine
A computer grid designed to analyse data that will be generated by the world’s biggest scientific experiment is being used by two high-tech companies to help build a next-generation internet search engine.
The Cambridge start-up firms — Imense and iLexIR — have created a joint venture called Camtology to link their individual expertise and products together to enable searching of both text and images online.
Their aim is to provide a British-based search engine capable of competing with the best providers on the world stage, capturing a large share of the huge market for search services.
Imense is already building the next generation of image searching, developing innovative, more powerful solutions that make retrieval of images easier than with any existing system on the internet.
iLexIR focuses on natural language processing, aimed at identifying relevant information, as opposed to just looking at individual words in a document.
The Camtology team is using GridPP to test and enhance their software. Funded by Britain’s Science & Technology Facilities Council (STFC), this computer grid was built to handle and analyse Britain’s share of the petabytes of data generated annually by the Large Hadron Collider project at the European Organisation for Nuclear Research (CERN) in Switzerland, a task that requires huge data storage and processing capabilities. (One petabyte is one quadrillion bytes.)
With the aim of becoming 'the Google of image searching', Imense has developed a search engine that will make sense of the huge numbers of pictures on the web.
Although images and video make up more than 70% of digital data available on the internet, traditional software cannot index this information directly, relying totally on text descriptions entered by hand.
Imense’s software can look at a photo and recognise the colours, shapes, objects and scenes and retrieve images based on their content, without the need for human-generated captions.
It also uses a query language — the user just types in a few key words and the software can interpret the request and match it to relevant images on the basis of their visual content.
The use of the Grid and its vast processing power has enabled Imense to test and demonstrate its software on millions upon millions of images, a scale that would otherwise have been impossible.
iLexIR is focusing on natural language processing. Current search engines present pages of results in order of expected relevance to a query, based on key words typed in by the user. This usually results in vast numbers of irrelevant pages being returned, often with some important results missing.
Natural language processing can help both with interpreting the query and, crucially, with interpreting the pages that hold the potential answers.