Free HTML parser software for Windows

html parser

Results 1 - 15 of about 6241
HTML Parser 1.6

A Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, custom tags, visitors, and easy-to-use JavaBeans. HTML Parser is a robust, fast, and well-tested package.
Welcome to the homepage of HTMLParser, a super-fast real-time parser for real-world HTML. What has attracted most developers to HTMLParser is its simplicity of design, its speed and its ability to handle streaming real-world HTML.
The two fundamental use cases handled by the parser are extraction and transformation (the synthesis use case, where HTML pages are created from scratch, is better handled by other tools closer to the source of data).
In general, to use HTMLParser you will need to be able to write code in the Java programming language. Although some example programs are provided that may be useful as they stand, it's more than likely you will need (or want) to create your own programs or modify the ones provided to match your intended application.
To use the library, you will need to add either the htmllexer.jar or htmlparser.jar to your classpath when compiling and running. The htmllexer.jar provides low level access to generic string, remark and tag nodes on the page in a linear, flat, sequential manner.
The htmlparser.jar, which includes the classes found in htmllexer.jar, provides access to a page as a sequence of nested differentiated tags containing string, remark and other tag nodes.
Extraction
Extraction encompasses all the information retrieval programs that are not meant to preserve the source page.
This covers uses like:
- text extraction, for use as input for text search engine databases for example
- link extraction, for crawling through web pages or harvesting email addresses
- screen scraping, for programmatic data input from web pages
- resource extraction, collecting images or sound
- a browser front end, the preliminary stage of page display
- link checking, ensuring links are valid
- site monitoring, checking for page differences beyond simplistic diffs
There are several facilities in the HTMLParser codebase to help with extraction, including filters, visitors and JavaBeans.
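As a rough illustration of the link and text extraction cases listed above, here is a minimal sketch. It uses Python's standard html.parser module rather than the Java HTMLParser library, so the class name LinkAndTextExtractor, the extract helper and the sample markup are all assumptions made for this example and are not part of the library's API.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collect href values and visible text while discarding the markup."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep only non-blank runs of text.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def extract(html):
    """Return (list of link targets, text joined with single spaces)."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


# Hypothetical sample page, invented for this sketch.
sample = ('<p>See <a href="http://example.com/a">this page</a> or '
          '<a href="mailto:me@example.com">mail me</a>.</p>')
links, text = extract(sample)
print(links)
print(text)
```

The source page is not preserved here: only the harvested links and text survive, which is what distinguishes extraction from transformation.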
Transformation
Transformation includes all processing where the input and the output are HTML pages.
Some examples are:
- URL rewriting, modifying some or all links on a page
- site capture, moving content from the web to local disk
- censorship, removing offending words and phrases from pages
- HTML cleanup, correcting erroneous pages
- ad removal, excising URLs referencing advertising
- conversion to XML, moving existing web pages to XML
During or after reading in a page, operations on the nodes can accomplish many transformation tasks "in place", which can then be output with the toHtml() method. Depending on the purpose of your application, you will probably want to look into node decorators, visitors, or custom tags in conjunction with the PrototypicalNodeFactory.
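The URL rewriting case can be sketched in the same spirit. The example below again uses Python's standard html.parser instead of the Java library, so it does not show toHtml(), node decorators or the PrototypicalNodeFactory; LinkRewriter, rewrite_links and the mirror host are hypothetical names chosen for illustration. It re-emits the page as it is read and changes only the href values of anchor tags.

```python
from html import escape
from html.parser import HTMLParser


class LinkRewriter(HTMLParser):
    """Re-emit the input HTML, passing each <a href> through a rewrite function."""

    def __init__(self, rewrite):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.rewrite = rewrite
        self.out = []

    def _tag(self, tag, attrs, closing=">"):
        parts = [tag]
        for name, value in attrs:
            if tag == "a" and name == "href" and value is not None:
                value = self.rewrite(value)
            # Attribute values arrive unescaped, so escape them again on output.
            parts.append(name if value is None
                         else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._tag(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_links(html, rewrite):
    parser = LinkRewriter(rewrite)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Hypothetical page and mirror host, invented for this sketch.
page = '<p>Go <a href="/docs/index.html" class="nav">here</a> &amp; read.</p>'
result = rewrite_links(
    page,
    lambda url: "http://mirror.example" + url if url.startswith("/") else url,
)
print(result)
```

Because the output is rebuilt from parsed tags, attribute quoting and spacing are normalised; a tool that must preserve the source byte for byte would need a position-based approach instead.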
Download (4.14MB)
Added: 2007-08-24 License: GPL Price:
511 downloads
 
Other version of HTML Parser
HTML Parser 1.0
ScalingWeb - This software was created to help you parse any HTML you might need while working.
License:Freeware
Download (12.4KB)
766 downloads
Added: 2007-10-25
WER HTML Parser 1.01

The WER HTML Parser application was designed to read and analyze HTML files and represent them in tree-view form, where each node is an HTML element and embedded elements are displayed as children of their parent element. It is very useful if you want to check your site for errors or understand the structure of a page.
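The tree-view idea can be sketched in a few lines. This is not the WER application's code; it is a minimal Python illustration using the standard html.parser module, and the names TreeBuilder, outline and tree_view are invented for the example. Each element becomes a node, and nested elements are printed indented beneath their parent.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not stay open on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Build a nested dict tree of element names from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break


def outline(node, depth=0):
    """Return one indented line per node, parents before children."""
    lines = ["  " * depth + node["tag"]]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


def tree_view(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return "\n".join(outline(builder.root))


view = tree_view("<html><body><h1>Hi</h1><p>One<br>two</p></body></html>")
print(view)
```

An unclosed or misnested element shows up as an unexpectedly deep or shallow branch in the outline, which is how such a view helps with spotting errors.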

Download (244KB)
Added: 2008-09-30 License: Freeware Price: FREE
391 downloads
Jericho HTML Parser 2.4

A simple but powerful Java HTML parser library allowing analysis and manipulation of parts of an HTML document. It handles some common server-side tags, and both analysis and manipulation reproduce verbatim any unrecognised or invalid HTML.
Jericho HTML Parser also provides high-level HTML form manipulation functions.
Jericho HTML Parser is released under both the GNU Lesser General Public License (LGPL) and the Eclipse Public License (EPL). You are therefore free to use it in commercial applications subject to the terms detailed in either one of these licence documents.
The javadocs provide comprehensive documentation of the entire API, as well as being a very useful reference on aspects of HTML and XML in general.
Main features:
- The presence of badly formatted HTML does not interfere with the parsing of the rest of the document, which makes the library ideal for use with "real-world" HTML that chokes other parsers.
- ASP, JSP, PSP, PHP and Mason server tags are explicitly recognised by the parser. This means that normal HTML is still parsed properly even if there are server tags inside it, which is common, for example, when dynamically setting element attributes.
- It is neither an event nor tree based parser, but rather uses a combination of simple text search, efficient tag recognition and a tag position cache. The text of the whole source document is first loaded into memory, and then only the relevant segments are searched for the relevant characters of each search operation.
- Compared to a tree based parser such as DOM, the memory and resource requirements can be far better if only small sections of the document need to be parsed or modified. Incorrect or badly formatted HTML can easily be ignored, unlike tree based parsers which must identify every node in the document from top to bottom.
- Compared to an event based parser such as SAX, the interface is on a much higher level and more intuitive, and a tree representation of the document element hierarchy is easily created if required.
- The begin and end positions in the source document of all parsed segments are accessible, allowing modification of only selected segments of the document without having to reconstruct the entire document from a tree.
- The row and column number of each position in the source document is easily accessible.
- Provides a simple but comprehensive interface for the analysis and manipulation of HTML form controls, including the extraction and population of initial values, and conversion to read-only or data display modes. Analysis of the form controls also allows data received from the form to be stored and presented in an appropriate manner.
- Custom tag types can be easily defined and registered for recognition by the parser.
- Built-in functionality to format HTML source code that indents elements according to their depth in the document element hierarchy.
- Built-in functionality to render HTML markup with simple text formatting.
- Built-in functionality to extract all text from HTML markup, suitable for feeding into a text search engine such as Apache Lucene.
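To make the position-based features above more concrete, here is a small sketch of the general technique: record the begin and end offsets (and row and column) of each tag, then modify one segment of the source without rebuilding the rest. It is written with Python's standard html.parser, not with Jericho, and TagLocator and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class TagLocator(HTMLParser):
    """Record where each start tag begins and ends in the source text."""

    def __init__(self, source):
        super().__init__()
        # Offset of the first character of every line, to turn (row, col)
        # into an absolute index. Only "\n" counts as a line break here,
        # matching how HTMLParser counts lines.
        self.line_starts = [0]
        for i, ch in enumerate(source):
            if ch == "\n":
                self.line_starts.append(i + 1)
        self.found = []
        self.feed(source)
        self.close()

    def handle_starttag(self, tag, attrs):
        row, col = self.getpos()  # row is 1-based, col is 0-based
        begin = self.line_starts[row - 1] + col
        end = begin + len(self.get_starttag_text())
        self.found.append((tag, row, col, begin, end))


# Hypothetical document, invented for this sketch.
source = '<ul>\n  <li class="old">one</li>\n  <li>two</li>\n</ul>\n'
locator = TagLocator(source)

# Replace only the first <li> start tag; every other byte is left untouched.
tag, row, col, begin, end = locator.found[1]
patched = source[:begin] + '<li class="new">' + source[end:]
print(tag, row, col)
print(patched)
```

Everything outside the selected span is copied verbatim, so unrecognised or invalid markup elsewhere in the document passes through unchanged.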
Download (1.36MB)
Added: 2007-08-23 License: GPL Price:
800 downloads
Nathan Levecks HTML Parser 2

Takes naked HTML and parses it into a VBScript Response. Version 2 adds the ability to retain your HTML formatting or compress it into a single line for smaller page sizes. Also added install and uninstall support. Updated links.
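The listing does not show the tool's output, so the following is only a guess at the kind of conversion described. It assumes that "a VBScript Response" means a series of ASP Response.Write statements, one per line of HTML, with an option to compress the page to a single line. The function name html_to_vbscript and its compress parameter are hypothetical.

```python
def html_to_vbscript(html, compress=False):
    """Wrap HTML in VBScript Response.Write statements (assumed output form)."""
    if compress:
        # Collapse the page to one line: trim each line and join with a space.
        lines = [" ".join(line.strip() for line in html.splitlines()
                          if line.strip())]
    else:
        lines = html.splitlines()
    statements = []
    for line in lines:
        # VBScript escapes a double quote inside a string literal by doubling it.
        escaped = line.replace('"', '""')
        # Keep the original line breaks unless the page is being compressed.
        suffix = "" if compress else " & vbCrLf"
        statements.append(f'Response.Write "{escaped}"{suffix}')
    return "\n".join(statements)


# Hypothetical input, invented for this sketch.
page = '<p class="x">\n  Hello\n</p>'
formatted = html_to_vbscript(page)
single_line = html_to_vbscript(page, compress=True)
print(formatted)
print(single_line)
```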
Download (1514K)
Added: 2000-12-04 License: Freeware Price:
3248 downloads
 
Other version of Nathan Levecks HTML Parser
Nathan Levecks HTML Parser (Perl Edition) 2
Takes naked HTML and parses it into a Perl script. Can optionally include Shebang line and HTTP header. Version
License:Freeware
Download (1515K)
3268 downloads
Added: 2000-12-04
HTML Movie Parser 1

HTML Movie Parser retrieves a movie so that you can save it. By sifting through the source code of a Web page, this application identifies code that applies to a video link. It then isolates the link and opens it, giving you the opportunity to save the video to any file location. It works well if you frequently visit message boards and would like to save videos without having to sift through your Internet cache.
Download (151KB)
Added: 2008-08-17 License: Freeware Price: Free
507 downloads
 
Other version of HTML Movie Parser
HTML Movie Parser 1.0
HTML Movie Parser is a free tool that can retrieve a movie and save it for you. By sifting through
License:Freeware
Download (151KB)
1155 downloads
Added: 2006-11-09
DTS Parser 2.0

DTS Parser is a simple, easy-to-use, smart tool that can reconstruct any DTS file (even a truncated one) while applying modifications such as:
- remove Dialog Normalisation
- remove CRC
Most DTS formats are recognized:
- DTS
- DTS ES Discrete
- DTS ES Matrix
- DTS 24/96
- 14 bits DTS from Audio CD (aka. DTSWAV)
Download (312KB)
Added: 2007-07-30 License: Freeware Price:
979 downloads
HTML Publisher 3.0

Create your own web sites. This powerful WYSIWYG editor is all anyone will ever need to design a web site. Includes a wizard to help those with no HTML experience.
Download (4.34MB)
Added: 2005-10-14 License: Freeware Price: Free
1471 downloads
MPEG Parser 1.3

MPEG Parser is a utility that shows you the internal structure of an MPEG file, including the fields of the structures that make up the file.
MPEG Parser shows the following data about an MPEG file: resolution, frame rate, size, aspect ratio, bitrate ... The utility can also correct errors in MPEG files: use the menu item "Save corrected..." for this. You can also save the internal structure to a text file via the menu item "Save MPEG information...".
Download (288KB)
Added: 2006-06-30 License: Freeware Price:
1216 downloads
 
Other version of MPEG Parser
MPEG Parser 1.0
MPEG Parser is a program for viewing the internal structure of MPEG files. It can view fields
Price: $0.00
License:Freeware
Download (881KB)
1697 downloads
Added: 2005-03-02
LOTEC HTML Faser 1.0

LOTEC HTML Faser is a great little HTML editor that comes with dynamic codes and a Java Script Gallery with 6 super Java Scripts.
Download (1600k)
Added: 1999-09-11 License: Freeware Price: $0.00
3700 downloads
Parser 1.3

Parser v1.3 lets you use Addict 2.xx with DBRichViewEdit and RichViewEdit.
This version of the parser and demo is also included in the installation of Addict v2.4. It also skips Unicode text (and text with SYMBOL_CHARSET).
Download (11.6KB)
Added: 2006-11-26 License: Freeware Price:
1103 downloads
HTML Compress 5.5

HTML Compress compresses HTML by removing unnecessary white space characters such as carriage returns, line feeds and spaces.
It can also remove certain HTML tags. JavaScript, CSS and VBScript can be subjected to similar compression. Because HTML Compress is extensible, it can be configured to optimize other SGML-based formats too: basically anything that uses angle brackets to define "tags". Unlike some compression software, HTML Compress does not remove the terminating tag of a tag pair.
Compressed HTML downloads faster, displays faster (the parser does not have as much junk to deal with), takes less space on web servers and end users' systems, and decreases the load on the server.
HTML Compress is fully configurable: besides setting options for individual tags, you can set options for individual attributes through pattern matching, which allows higher compression rates.
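The core whitespace step can be sketched as follows. This is a deliberately naive illustration and not HTML Compress's algorithm: it splits the page on anything that looks like a tag, collapses runs of whitespace in the text between tags, and leaves the contents of whitespace-sensitive elements alone. The function name compress_html and the PRESERVE list are assumptions, and comments containing ">" are not handled.

```python
import re

# Elements whose contents must keep their whitespace exactly as written.
PRESERVE = ("pre", "textarea", "script", "style")
TAG = re.compile(r"(<[^>]*>)")
TAG_NAME = re.compile(r"</?\s*([a-zA-Z0-9]+)")


def compress_html(html):
    """Collapse whitespace runs between tags; all tags are kept as they are."""
    out = []
    preserved = None  # name of the whitespace-sensitive element we are inside
    for piece in TAG.split(html):
        if piece.startswith("<"):
            match = TAG_NAME.match(piece)
            name = match.group(1).lower() if match else ""
            is_end = piece.startswith("</")
            if preserved is None and not is_end and name in PRESERVE:
                preserved = name
            elif preserved == name and is_end:
                preserved = None
            out.append(piece)
        elif preserved is not None:
            out.append(piece)  # inside <pre> and friends: copy verbatim
        else:
            out.append(re.sub(r"\s+", " ", piece))
    return "".join(out)


# Hypothetical input, invented for this sketch.
page = "<p>  Hello,\n   world  </p>\n\n<pre>  keep\n  this </pre>\n"
small = compress_html(page)
print(repr(small))
```

Runs of whitespace are reduced to a single space rather than removed outright, because a space between two inline elements can affect how the page renders.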
Download (1.33MB)
Added: 2006-01-31 License: Freeware Price:
1549 downloads
 
Other version of HTML Compress
HTML Compress 4.0
Because of the extensible nature of HTML Compress, it can be configured to allow it to compress ... Compression of HTML allows it to download faster, display faster (the parser does not have as
Price: $0.00
License:Freeware
Download (550K)
2973 downloads
Added: 2001-09-22
HTML Shrinker 2.6

HTML Shrinker is a tool for reducing the size of HTML files, so your web site loads faster. It removes all unnecessary bytes from HTML files. The look of the HTML page won't change after it is compressed.
Download (1.3MB)
Added: 2008-11-08 License: Freeware Price:
388 downloads
ShaniXmlParser 1.4.16

ShaniXmlParser is a small and fast XML/HTML DOM/SAX non-validating parser written in Java. It can also be used as a parser for invalid XML files.
ShaniXmlParser uses the org.w3c.dom interfaces and the JAXP interfaces. It also works on Mono/.NET thanks to IKVM.
ShaniXmlParser can parse badly formed XML files; for example, it can parse files with inverted tags or badly escaped & characters. It expands all entities (if a doctype is present or auto doctype is set).
A CSS parser and a DTD parser are also included. Entities are decoded from the DTD, if any. If there is no DTD and the AUTO_DOCTYPE attribute is set on the factory, entity replacement falls back to the internal entity set (equal to the XHTML 1.0 entity set).
The DTD parser parses entity, element, attlist and notation declarations. It generates regular expressions to check the validity of the document.
The DOM parser goes directly into HTML mode if any of the following conditions is met:
- The root node is <html>
- A w3c HTML/XHTML DTD is linked with the document
Download (3.29MB)
Added: 2007-07-20 License: GPL Price:
827 downloads
Pix Parser 1.0

PixParser is a free program that parses Cisco PIX log files to find any search term specified. It builds an HTML file, currently called report.html, presenting your search results.
It does a reverse lookup on all IPs, so that you don't have to manually figure out where people have gone on the Internet.
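A workflow of this kind (filter log lines by a search term, resolve the IP addresses found, write the matches as an HTML table) might look roughly like the sketch below. It is not PixParser's code: the function names, the report layout and the sample log lines are all invented, and the sample lines are not meant to reproduce the real PIX log format. The resolver is passed in as a parameter so the example runs without network access.

```python
import re
import socket
from html import escape

IP = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def reverse_lookup(ip):
    """Return the host name for an IP address, or the IP itself on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip


def build_report(log_lines, term, resolve=reverse_lookup):
    """Return an HTML table of the lines containing term, with resolved hosts."""
    rows = []
    for line in log_lines:
        if term.lower() not in line.lower():
            continue
        hosts = ", ".join(resolve(ip) for ip in IP.findall(line))
        rows.append(f"<tr><td>{escape(line.strip())}</td>"
                    f"<td>{escape(hosts)}</td></tr>")
    return ("<html><body><table>\n"
            "<tr><th>Log line</th><th>Hosts</th></tr>\n"
            + "\n".join(rows)
            + "\n</table></body></html>\n")


# Hypothetical log lines and a stand-in resolver, invented for this sketch.
log = [
    "Jan 01 12:00:01 Built outbound TCP connection for 10.0.0.5 to 192.0.2.10/80",
    "Jan 01 12:00:02 Deny inbound UDP from 198.51.100.7 to 10.0.0.5",
]
fake_dns = {"192.0.2.10": "gateway.example"}
report = build_report(log, "outbound", resolve=lambda ip: fake_dns.get(ip, ip))
print(report)
```

Leaving out the resolve argument makes build_report use real reverse DNS lookups, and the returned string can then be written to a file such as report.html.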
Download (309KB)
Added: 2006-12-22 License: Freeware Price:
1114 downloads
Mathparser 1.0

Mathparser is a .NET assembly containing a math parser written in C#. It evaluates a mathematical expression and returns a double value.
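For readers unfamiliar with what such a component does, here is a minimal recursive-descent sketch of the idea, in Python rather than C#. It is not Mathparser's API; the grammar, which covers only numbers, + - * /, unary minus and parentheses, and the names tokenize and evaluate are assumptions chosen for illustration. The result is a Python float, which is a double-precision value.

```python
import re

# A number, or any single other character, optionally preceded by whitespace.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d*)?|\.\d+)|(.))")


def tokenize(text):
    tokens = []
    for number, symbol in TOKEN.findall(text.strip()):
        if number:
            tokens.append(float(number))
        elif symbol in "+-*/()":
            tokens.append(symbol)
        else:
            raise ValueError(f"unexpected character {symbol!r}")
    return tokens


def evaluate(expression):
    """Evaluate an arithmetic expression string and return a float."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():    # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            op = take()
            right = term()
            value = value + right if op == "+" else value - right
        return value

    def term():    # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            op = take()
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def factor():  # factor := number | '-' factor | '(' expr ')'
        token = take()
        if isinstance(token, float):
            return token
        if token == "-":
            return -factor()
        if token == "(":
            value = expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        raise ValueError(f"unexpected token {token!r}")

    result = expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result


print(evaluate("2 * (3 + 4) - 10 / 4"))
```

Each grammar rule maps to one nested function, so operator precedence falls out of the call structure: expr calls term, which calls factor.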
Download (74KB)
Added: 2006-10-03 License: Freeware Price:
1125 downloads