
Welcome to PatCit

Building a comprehensive dataset of patent citations

Patents are at the crossroads of many innovation nodes: science, industry, products, competition, etc. Such interactions can be identified through citations in a broad sense.

It is now common to use front-page patent citations to study some aspects of the innovation system. The good news is that much more is buried in the front-page Non-Patent Literature (NPL) citations and in the patent text itself [1]. patCit extracts and structures these citations.


Getting started

🛢️ Exploring the universe of patent citations has never been easier. No more complicated data setup, memory issues, or queries that run forever: we host patCit on BigQuery for you. You can also download it from Zenodo.

👩‍🔬 Time to play! We give public access to quickstart notebooks.

🤗 patCit is community driven and benefits from the support of a responsive team that is happy to help and tackle your next request. This is where academics and industry practitioners meet.

🔮 patCit is based on state-of-the-art open-source projects and libraries such as grobid/biblio-glutton and spaCy. Even better, patCit continuously improves along with the rest of its ecosystem.

🎓 Want to know more? Read the patCit academic presentation or dive into the usage and technical guides on the patCit documentation website.


  1. Front-page NPL citations contain bibliographical references, office actions, search reports, patents, webpages, wikis, norms and standards, product documentation, databases, and litigation. Patent text notably contains citations of patents, NPL, software, databases, and products.