#openmp dataset


half star

Hey, I scraped the open.mp documentation and processed it with AI, creating a dataset of around 3,600 lines. As a trial, I fine-tuned the gpt-oss-120b model, and the results turned out quite good for a model of this size.

Next, I’m planning to expand the dataset by processing well-structured libraries, game modes, filterscripts, and snippets in the same way. Clean and well-organized projects in particular should significantly improve the overall dataset quality.

If you know any game modes or filterscripts with solid, reliable code architecture, feel free to share them here. I’m aiming to make the dataset more robust and diverse.
Dataset: https://huggingface.co/datasets/yeatdev/openmp-dataset


manic kelp

Do you think it would be a good idea to add information to the dataset about the other programming languages used in various gamemodes, such as Go, C#, Python, TypeScript, Rust, etc.?

half star

The dataset has now been updated on Hugging Face.

Sources used:

Southclaws/ScavengeSurvive (GitHub): A PvP SA:MP survival gamemode. The aim of the game is to find supplies such as tools or weapons to help you survive, either alone or in a group.

gurkansahinn/lpz-gamemode-samp (GitHub): A kind of zombie server for SA-MP. The overall concept is to strengthen the survival elements and push the limits of the PAWN language.

After analyzing 330 files, I generated around 990 high-quality Q&A pairs.

DeepSeek-V4-Pro was used throughout the analysis and dataset refinement process.