Warctools

From COPTR
Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
Homepage: https://pypi.python.org/pypi/warctools/
Source Code: https://github.com/internetarchive/warctools/
License: MIT License
Platforms: Cross-platform
Language: Python
Input Formats: WARC, ARC (Internet Archive)
Output Formats: WARC
Function: File Format Migration, Metadata Extraction, Validation
Content type: Web



Description

Warctools is the most current and best-maintained Python codebase for working with WARC files. It provides command-line tools for common WARC/ARC operations and can also be used as a library to create or process WARC files directly from Python.
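To make concrete what creating or reading a WARC file involves, here is a minimal sketch of the uncompressed WARC/1.0 record layout, written with the Python standard library only. It does not use or reproduce the warctools API; the function names `build_record` and `read_records` are hypothetical and exist only for this illustration, and the sketch ignores gzip compression, ARC input, and validation, which real tools handle.

```python
import io
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_record(record_type, target_uri, payload):
    """Serialize one uncompressed WARC/1.0 record (hypothetical helper)."""
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", "<urn:uuid:%s>" % uuid.uuid4()),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(payload))),
    ]
    head = b"WARC/1.0" + CRLF
    for name, value in headers:
        head += ("%s: %s" % (name, value)).encode("utf-8") + CRLF
    # A blank line ends the headers; two CRLFs follow the content block.
    return head + CRLF + payload + CRLF + CRLF


def read_records(stream):
    """Yield (headers, payload) for each record in an uncompressed stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.strip() == b"":
            continue  # skip the blank lines that separate records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected WARC version line, got %r" % line)
        headers = {}
        while True:
            line = stream.readline()
            if line in (CRLF, b"\n", b""):
                break  # blank line: end of headers
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The block is read by length, so it may itself contain blank lines.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


if __name__ == "__main__":
    data = build_record("resource", "http://example.com/", b"hello, archive")
    for headers, payload in read_records(io.BytesIO(data)):
        print(headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

Reading the block by its declared `Content-Length` rather than scanning for a delimiter is what lets a record safely carry arbitrary bytes, including HTTP responses.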

Pull requests and releases are currently managed by Thomas Figg, who can be contacted via Twitter.

Older Python WARC Implementations

This codebase was initially funded by IIPC and developed by Hanzo Archives. This led to the hanzo-warc-tools package and source code.

There is also a separate warc package that was created by the Internet Archive (see source code), but is no longer in use.

Both of these projects are defunct and are now superseded by the internetarchive/warctools project.

User Experiences

Development Activity

Releases

5.0.0 (2025-05-30 17:17:16), by mistydemeo
4.10.0 (2016-09-01 22:39:45), by nlevitt
4.15-rc1 (2012-11-29 13:31:13), by lekash
build_success-2012-09-14T16-25-56.483325901 (2012-09-14 15:18:43), by SteveJones
build_success-2012-09-14T15-24-42.616660024 (2012-09-14 13:27:40), by SteveJones

Development

config: migrate to pyproject.toml (2025-05-30 18:01:33, commit 21db132fd3e4b4042cd011d9dc3fb30276a5a0b6), by mistydemeo (https://github.com/mistydemeo)
gitignore: add dist (2025-05-30 17:20:41, commit 4c17416597117dd50205d3273fc3342a0c511353), by mistydemeo
release: 5.0.0 (2025-05-30 17:17:16, commit df88bcd7ef880c64a729e57e554c65566459b711), by mistydemeo
release: 5.0.0 (2025-05-30 17:16:46, commit 2722ae8888f99be044ee26c34f6f6b16f60cd6a6), by mistydemeo
Add a gitignore (2025-05-30 17:16:46, commit 974120236a3776c0fbdfec1bbbc186ac9d7f6dc3), by mistydemeo