Extracting current language use from the web
Harvesting the web for linguistic purposes has only just begun. With functions such as "define:" in Google, we may be entering an era in which things are looked up on the web rather than in dictionaries or encyclopedias. We present examples of how dynamic language resources can be created by exploiting web search engines. WebCorp and KWiCFind make the web available for creating ad hoc concordances. They are excellent tools for batch use, but they are less suited to interactive work because obtaining a concordance takes time. Lexware Web Concordance is designed specifically for interactive online use and works much like a search engine in a browser: one enters a word or phrase, presses the 'Search' button, and gets a concordance in about the time it takes Google to respond to a query. The resulting concordance shows how the word or phrase is used on the web, in KWiC (keyword in context) format. Lexware Web Concordance is built on top of Google, and it is quick because it uses Google snippets as the basis for its concordance lines. Every concordance line is presented with a link to the site from which it comes, so a user can also check occurrences of the word or phrase on the source page. We present examples of how the system can be used by language learners and linguists.
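To make the snippet-based approach concrete, here is a minimal Python sketch of how KWiC lines might be built from snippets that have already been retrieved. It is not the Lexware implementation. The function names (kwic_lines, format_kwic), the context width of 30 characters, and the sample snippets and URLs are all assumptions made for illustration, and no search engine is actually queried.

```python
import re


def kwic_lines(snippets, query, width=30):
    """Build KWiC tuples from (snippet_text, url) pairs.

    Each whole-word, case-insensitive occurrence of `query` yields one
    tuple: (left_context, keyword, right_context, url).
    """
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    lines = []
    for text, url in snippets:
        # Collapse whitespace so line breaks inside a snippet do not
        # disturb the alignment of the output.
        text = " ".join(text.split())
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            lines.append((left, match.group(0), right, url))
    return lines


def format_kwic(lines, width=30):
    """Right-align the left context so the keywords form one column."""
    return [
        f"{left:>{width}} {keyword} {right:<{width}} {url}"
        for left, keyword, right, url in lines
    ]


# Hypothetical sample data standing in for search-engine snippets.
SAMPLE_SNIPPETS = [
    ("Learners can look up a word in context rather than in a dictionary.",
     "http://example.org/a"),
    ("In context, the phrase takes on a different meaning.",
     "http://example.org/b"),
]

if __name__ == "__main__":
    for line in format_kwic(kwic_lines(SAMPLE_SNIPPETS, "in context")):
        print(line)
```

Because the sketch works only on short snippets and never fetches the full source pages, it needs no more than one pass over the text already returned with the search results, which is the property the abstract credits for the quick response. Keeping the URL with each line preserves the link back to the source page.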