Common Crawl Foundation

Team
non-profit
Verified
https://commoncrawl.org
commoncrawl

AI & ML interests

Crawled data and metadata

Recent Activity

tvaughan  updated a dataset 5 days ago
commoncrawl/statistics
greglindahl  published a dataset 6 days ago
commoncrawl/host-index-testing-v2
malteos  updated a Space 8 days ago
commoncrawl/cc-citations

Team members: Thom Vaughan, Pedro Ortiz Suarez, Paul Lazar, Greg Lindahl, Ford H, Jen English, Sebastian Nagel, Laurie Burchell, Hande Celikkanat, malteos, Thijs Dalhuijsen, Luca, Catherine Arnett

commoncrawl's datasets (7)

commoncrawl/statistics

Viewer • Updated 5 days ago • 611k • 241 • 26

commoncrawl/CommonLID

Viewer • Updated 18 days ago • 373k • 347 • 42

commoncrawl/gneissweb-annotation-host-testing-v1

Viewer • Updated Dec 11, 2025 • 617M • 176

commoncrawl/gneissweb-annotation-url-testing-v1

Viewer • Updated Dec 10, 2025 • 11.5B • 2.82k

commoncrawl/host-index-testing-v2

Preview • Updated Nov 10, 2025 • 3

commoncrawl/citations

Viewer • Updated Oct 16, 2025 • 9.18k • 61 • 1

commoncrawl/eot2024_hostlevel_logs

Viewer • Updated Oct 9, 2024 • 271k • 27 • 1