
Jericho HTML Parser 2.4




Jericho HTML Parser 2.4 Ranking & Summary

User Review: 0 (0 times)
File size: 1.36 MB
Platform: Windows All
License: LGPL / EPL
Downloads: 862
Date added: 2007-08-23
Publisher: Martin Jericho

Jericho HTML Parser 2.4 description

Jericho HTML Parser is a simple but powerful Java library for analysing and manipulating parts of an HTML document, including some common server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
Jericho HTML Parser also provides high-level HTML form manipulation functions.
Jericho HTML Parser is released under both the GNU Lesser General Public License (LGPL) and the Eclipse Public License (EPL). You are therefore free to use it in commercial applications subject to the terms detailed in either one of these licence documents.
The javadocs provide comprehensive documentation of the entire API, as well as being a very useful reference on aspects of HTML and XML in general.
Main features:
- The presence of badly formatted HTML does not interfere with the parsing of the rest of the document, which makes the library ideal for use with "real-world" HTML that chokes other parsers.
- ASP, JSP, PSP, PHP and Mason server tags are explicitly recognised by the parser. This means that normal HTML is still parsed properly even if there are server tags inside it, which is common, for example, when element attributes are set dynamically.
- It is neither an event based nor a tree based parser, but rather uses a combination of simple text search, efficient tag recognition and a tag position cache. The text of the whole source document is first loaded into memory, and then only the relevant segments are searched for the relevant characters of each search operation.
- Compared to a tree based parser such as DOM, the memory and resource requirements can be far lower if only small sections of the document need to be parsed or modified. Incorrect or badly formatted HTML can easily be ignored, whereas a tree based parser must identify every node in the document from top to bottom.
- Compared to an event based parser such as SAX, the interface is on a much higher level and more intuitive, and a tree representation of the document element hierarchy is easily created if required.
- The begin and end positions in the source document of all parsed segments are accessible, allowing modification of only selected segments of the document without having to reconstruct the entire document from a tree.
- The row and column number of each position in the source document is easily accessible.
- Provides a simple but comprehensive interface for the analysis and manipulation of HTML form controls, including the extraction and population of initial values, and conversion to read-only or data display modes. Analysis of the form controls also allows data received from the form to be stored and presented in an appropriate manner.
- Custom tag types can be easily defined and registered for recognition by the parser.
- Built-in functionality to format HTML source code, indenting elements according to their depth in the document element hierarchy.
- Built-in functionality to render HTML markup with simple text formatting.
- Built-in functionality to extract all text from HTML markup, suitable for feeding into a text search engine such as Apache Lucene.
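
The position-based approach described in the features above (whole source held in memory, simple text search, begin and end positions, row and column numbers) can be illustrated with a small sketch that uses only the Java standard library. It is not Jericho's implementation and does not use the Jericho API: the class SegmentSketch, the Span type and all method names are hypothetical, and the tag search is deliberately naive (it assumes no '>' inside attribute values).

```java
/** Hypothetical illustration only; none of these names belong to the Jericho API. */
final class SegmentSketch {

    /** A half-open [begin, end) character range in the source text. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /**
     * Finds the first start tag with the given name at or after 'from' using
     * plain text search. Returns null if none is found. Naive assumption:
     * the first '>' after the tag name closes the tag.
     */
    static Span findStartTag(String source, String name, int from) {
        int i = source.indexOf('<', from);
        while (i >= 0) {
            int after = i + 1 + name.length();
            if (after < source.length()
                    && source.regionMatches(true, i + 1, name, 0, name.length())) {
                char c = source.charAt(after);
                // Require a delimiter so that "a" does not match "<abbr>".
                if (c == '>' || c == '/' || Character.isWhitespace(c)) {
                    int close = source.indexOf('>', after);
                    return close < 0 ? null : new Span(i, close + 1);
                }
            }
            i = source.indexOf('<', i + 1);
        }
        return null;
    }

    /** Replaces only the given span; all other text is copied verbatim. */
    static String replace(String source, Span span, String replacement) {
        return source.substring(0, span.begin) + replacement + source.substring(span.end);
    }

    /** Returns the 1-based {row, column} of a character offset, treating '\n' as the line break. */
    static int[] rowColumn(String source, int pos) {
        int row = 1;
        int lineStart = 0;
        for (int i = 0; i < pos; i++) {
            if (source.charAt(i) == '\n') {
                row++;
                lineStart = i + 1;
            }
        }
        return new int[] { row, pos - lineStart + 1 };
    }

    public static void main(String[] args) {
        // The unclosed <b> is never inspected, so it cannot break the edit.
        String source = "<p>Broken <b>markup\n<img src=\"old.png\" alt=x>\n</p>";
        Span img = findStartTag(source, "img", 0);
        int[] rc = rowColumn(source, img.begin);
        System.out.println("img tag at row " + rc[0] + ", column " + rc[1]);
        System.out.println(replace(source, img, "<img src=\"new.png\" alt=x>"));
    }
}
```

Because the edit is expressed purely as character offsets into the original text, everything outside the replaced span, including the malformed markup, passes through unchanged. A tree based approach would instead have to build and re-serialise the whole document.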

