Monthly Archives for August 2013

Entity Extraction – URL

For this entity extraction task, my goal is to write a simple regex rule that identifies the most common URLs in text documents. Examples: http://shakthydoss.com, https://support.company.com, http://172.16.7.41/home/, http://172.6.7.41/home?name=shakthydoss&year=2013. As mentioned earlier, I took time to understand the structure of a URL. Every URL consists of the following units: the scheme name (commonly called […]
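To make the goal concrete, here is a minimal sketch of what such a rule might look like in Python. It is not the rule from the full post, which is cut off in this excerpt. The pattern, the name `URL_PATTERN`, and the helper `extract_urls` are assumptions made for illustration, and the pattern only covers http/https URLs whose host is a domain name or a dotted IPv4 address, as in the examples above.

```python
import re

# Hypothetical pattern for illustration only; it handles the common cases
# shown in the examples and is not a full URL grammar.
URL_PATTERN = re.compile(
    r"""
    https?://                              # scheme: http or https
    (?:
        (?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}   # domain name, e.g. support.company.com
        |
        \d{1,3}(?:\.\d{1,3}){3}            # or a dotted IPv4 address (octet range not checked)
    )
    (?::\d+)?                              # optional port
    (?:/[^\s?]*)?                          # optional path
    (?:\?\S*)?                             # optional query string
    """,
    re.VERBOSE,
)


def extract_urls(text):
    """Return every substring of `text` that matches URL_PATTERN, in order."""
    # All groups are non-capturing, so findall returns the full matches.
    return URL_PATTERN.findall(text)


sample = (
    "See http://shakthydoss.com , https://support.company.com , "
    "http://172.16.7.41/home/ , "
    "http://172.6.7.41/home?name=shakthydoss&year=2013 for details."
)
for url in extract_urls(sample):
    print(url)
```

A pattern this simple trades accuracy for readability: it does not validate IPv4 octet ranges, and a query string will absorb any punctuation that follows it without a space.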