There were some problems with the files for 2022, which have now been corrected.
There are two files available for 2022:
https://data.jobtechdev.se/annonser/historiska/index.html
2022.jsonl.zip - This is a file with one ad in JSON format per line. When you load it programmatically you don’t have to load all ads into memory; just read one line at a time and process that ad. Memory usage will be much lower, and the time until the first ad is available is greatly reduced.
2022.zip - This is a single JSON file, which we provide for legacy reasons.
The content of the two files is the same; only the file format differs.
The intermediate files for 2022 will be deleted.
Python code example showing how to read a jsonlines file:
# pip install jsonlines
import jsonlines

if __name__ == '__main__':
    # assumes 2022.jsonl.zip has already been unzipped to 2022.jsonl
    with jsonlines.open("2022.jsonl") as infile:
        for ad in infile:
            # do what you need to do with the ad
            pass
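
If you would rather not unzip the file first, the same line-by-line approach can be sketched with only the Python standard library, reading directly from the zip archive. This is a minimal sketch, not an official example: the function name iter_ads is made up for illustration, and it assumes the archive contains a single member ending in .jsonl and that the text is UTF-8 encoded (neither is stated above).

```python
import io
import json
import zipfile


def iter_ads(zip_path):
    """Yield one ad (a dict) at a time from a zipped JSON Lines file."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds one .jsonl member. Its exact name
        # is not given in the post, so it is picked by file extension.
        member = next(n for n in archive.namelist() if n.endswith(".jsonl"))
        with archive.open(member) as raw:
            # Assumption: the file is UTF-8 encoded text.
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)
```

Calling iter_ads("2022.jsonl.zip") in a for loop would then hand you one ad at a time, decompressing only as much of the archive as has been read so far.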