Bellingcat has also published a repository of open source notebooks, which you can find on our GitHub here.
The number of open source tools out there is growing rapidly, but technical barriers to entry mean they remain inaccessible to many researchers. GitHub, a platform where developers share and discuss their code, is home to many of these tools. Searching the website for open source investigation tools can appear daunting to the uninitiated: there are more than six thousand results. Beyond this, many more of the platform’s over 300 million other projects, from social media scrapers to AI models, also have a useful application in open source research.
But even many experienced researchers don’t use these tools. A 2022 survey by Bellingcat found that 45 per cent of researchers can’t use these tools, and in total 75 per cent have never used them. The core issue is accessibility: most tools are code scripts and command line interfaces. There’s no user interface to install, no web page to go to. While we encourage researchers to learn the command line and also teach it at our workshops, some tools require setup, debugging, and coding knowledge that limits who can effectively use them.
If you are part of that 45 per cent or that 75 per cent, there is a way to unlock this world of open source tools for your own research: code Notebooks, widely known as Jupyter Notebooks.
Notebooks started off as a scientific tool and are still mostly used for data science and AI projects, but their application can be much broader. Simply put, they are files in the .ipynb format where you can store and test code. They allow you to run Python, a coding language known for its simplicity. This is important for our purposes as it’s also the most popular language for open source research tools. Notebooks are run through interactive coding environments composed of sequential blocks (or cells), where each cell contains a piece of code or documentation about it.
Usually a Notebook is accessed via a specific application on your computer. Under the hood, the Notebook connects to a computer or server, its running environment. This can be your personal computer, if you set it up accordingly. But there’s a much easier way to familiarise yourself with Notebooks, and that’s using online services that can read, display, and run them. Some of these have a more accessible user interface and, crucially, require less knowledge of the command line.
The most prominent of these is Google Colab, a browser tool which displays Notebook files much like a normal Google Docs or Sheets document, with an accessible interface to match. Notebooks can be organised in your Drive, shared with others, edited by multiple people and, most importantly, executed safely, as if in a virtual machine. For example, the cell in the screenshot below from Google Colab contains a simple piece of Python code. Pressing the play button in the top left-hand corner runs the code. Below the cell you can see the result of the code as executed on a remote server hosted by the Notebook service you’re using.
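To give a sense of what a single code cell might hold, here is a minimal sketch. It is not the code from the screenshot; the variable names and the message are invented for illustration. Running the cell would print one line of text directly beneath it.

```python
# A single Notebook cell: store some values, build a message, show the result.
# The names below are illustrative only, taken from the services mentioned in this guide.
services = ["Google Colab", "Kaggle", "Binder"]

# Combine the list into one readable sentence.
message = f"{len(services)} hosted Notebook services: " + ", ".join(services)

# Whatever is printed appears directly below the cell.
print(message)
```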
There are also several alternative platforms like Kaggle or Binder. By using them, all you need to run a Notebook is the Notebook file itself (a file ending in .ipynb), while the running environment is hosted on a remote server.
Notebooks can hugely simplify experimentation with coding tools, scripts and data analysis. They offer the following advantages:
You can explore our sample notebook on Google Colab and run the code in its cells one at a time. Each cell is a self-contained block that instructs the running environment to perform an action.
Cells can be executed many times and in any order, but ideally a Notebook should give consistent results when its cells are run from top to bottom. What you do with code in cells has a cumulative effect on the virtual environment which runs it. For example, if you run a cell that installs a tool needed to download YouTube videos (like yt-dlp), the next cell can download a video, but not if you skip the installation cell.
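The cumulative effect can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not how Colab works internally: each "cell" is a short string of code, all cells share one memory, and the names `setup_cell`, `download_cell` and `video_tool` are hypothetical stand-ins for an installation step and a step that depends on it.

```python
# Toy model of a Notebook session: every cell runs in the same shared
# namespace, so later cells can see what earlier cells created.
def run_cells(cells):
    namespace = {}  # the shared memory of the running environment
    results = []
    for source in cells:
        try:
            exec(source, namespace)
            results.append("ok")
        except NameError as error:
            # The cell relied on something no earlier cell had set up.
            results.append(f"failed: {error}")
    return results

# Hypothetical cells: the first stands in for an installation step,
# the second for a step that needs the installed tool.
setup_cell = "video_tool = 'installed'"
download_cell = "status = f'downloading with tool: {video_tool}'"

in_order = run_cells([setup_cell, download_cell])
skipped = run_cells([download_cell])

print(in_order)
print(skipped)
```

Run in order, both cells succeed. With the setup cell skipped, the second cell fails because the thing it depends on was never created, which is the same situation as trying to download a video before installing the tool.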
In the sample Notebook linked to above, we show you how to run a simple Python program and how to provide input so the Notebook reacts to your needs. You’ll learn the difference between Python and the command line, and how to download a file to your Notebook and then to your own device.
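The difference between Python and the command line can be sketched as follows. This is not taken from the sample Notebook; it is an illustration that assumes only Python's standard library. In Colab and Jupyter, a line in a cell that starts with `!` is handed to the command line; here `subprocess` plays that role so the example runs as ordinary Python anywhere.

```python
import subprocess
import sys

# Python: this line runs inside the Notebook's own Python session.
greeting = "hello from Python"
print(greeting)

# Command line: a separate program is started and its output is captured.
# Here that program is simply another Python interpreter, chosen because
# it is guaranteed to exist wherever this example runs.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from the command line')"],
    capture_output=True,
    text=True,
    check=True,  # raise an error if the command fails
)
print(completed.stdout.strip())
```

Most of the tools discussed here are command line programs, which is why Notebooks that wrap them mix both kinds of line: command line steps to install and run the tool, and Python to handle input and results.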
In our GitHub repository, you will find an updated list of Notebooks that help you run both Bellingcat and community tools. One example is Bellingcat’s telegram-phone-number-checker, a command line tool to find Telegram accounts from a phone number. While that tool is relatively simple to install on your computer, you will need to check if it is compatible with your version of Python and your operating system. Once again, the advantage of using our Notebook in Google Colab is that you don’t need to worry about any of that: all you need are valid Telegram API keys, which you can get online.
Our hope is that this repository expands to include more Notebooks for useful tools and methods that lack a visual interface. In fact, we encourage you to give us feedback on GitHub if there’s a popular tool you’d like to see covered, and also to list your own Notebooks so others in the open source investigations community can benefit from them.
Much like the open source tools themselves, Notebooks may be new to many researchers. However, they provide so much out-of-the-box convenience that learning to use them is time well spent: the tools you’ll be able to use may just unlock your next investigation.