How 19-Year-Old Ukrainian Is Developing Computer Vision System for FPV Drones

© ROOT-NATION.com - Use of content is permitted with a backlink.

This material was provided through the Startup Support Program. The editorial team is not responsible for the accuracy of the facts or data presented in the article. All inquiries should be directed to the author and developer.

Just a few weeks ago, 19-year-old Ukrainian Oleksandr Shynkarenko knew almost nothing about machine learning. Today, he leads an AI project whose model detects military targets in drone footage with an mAP@50 score of 0.794. The developer shared the story of his project, Voron (“Raven”), with Root-Nation. It is a computer vision system whose models were trained in just 48 hours on a zero budget, after two weeks of data annotation. Shynkarenko relied solely on Google Colab, open-source tools, and AI assistants, using AI to teach himself how to build AI.

“Two weeks ago, I saw a video showing neural networks tracking targets from FPV drone footage. It was fascinating – and it raised a question: if others can do it, why can’t Ukraine have its own solutions?” says Oleksandr Shynkarenko. “At that moment, my experience in ML and computer vision was practically zero. Everything I knew about neural networks came from a few articles and YouTube videos. But I decided that the best way to learn was to start building.”

The developer collected video materials, extracted frames, and manually annotated targets. Some footage he recorded himself, while the rest came from open sources. This became his first experience in creating a dataset for computer vision.
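The frame-extraction step can be sketched roughly as follows, using OpenCV, which the project relies on for video preparation. This is an illustration rather than the developer's actual script: the function names, the one-frame-per-second sampling rate, and the output file naming are all assumptions.

```python
from pathlib import Path


def sampling_step(fps, every_seconds=1.0):
    """Number of frames to advance between saved frames (at least 1)."""
    return max(1, round(fps * every_seconds))


def extract_frames(video_path, out_dir, every_seconds=1.0):
    """Save sampled frames as JPEG files and return how many were written."""
    import cv2  # imported lazily so sampling_step works without OpenCV installed

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"cannot open video: {video_path}")

    # Some files report an FPS of 0; fall back to an assumed 30.
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    step = sampling_step(fps, every_seconds)
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    written = 0
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video or read error
            break
        if index % step == 0:
            cv2.imwrite(str(Path(out_dir) / f"frame_{index:06d}.jpg"), frame)
            written += 1
        index += 1

    cap.release()
    return written
```

Sampling one frame per second rather than every frame keeps near-identical images out of the dataset, which matters when each frame has to be annotated by hand.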

Two weeks of painstaking work

He used CVAT (Computer Vision Annotation Tool) to annotate the data – an open platform for labeling objects in images and video. The annotation process proved more time-consuming and complex than anticipated: each frame required careful review, drawing a bounding box around a target and assigning the correct class. The author defined five categories:

  • Person
  • Armored vehicle
  • Light military vehicle
  • Drone
  • Vehicle (civilian and military)
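Annotations like these are usually exported to the plain-text label format that Ultralytics YOLO reads: one line per object, containing a class ID followed by the box centre, width, and height, each normalized to the image size. The sketch below shows that conversion. The numeric class IDs and the identifier names are assumptions, since the article lists the categories but not how they were numbered.

```python
# Class order is an assumption; the article does not give the numeric IDs.
CLASS_IDS = {
    "person": 0,
    "armored_vehicle": 1,
    "light_military_vehicle": 2,
    "drone": 3,
    "vehicle": 4,
}


def to_yolo_line(label, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    # YOLO stores the box centre and size as fractions of the image dimensions.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{CLASS_IDS[label]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


print(to_yolo_line("drone", (100, 200, 300, 400), 1000, 800))
# prints: 3 0.200000 0.375000 0.200000 0.250000
```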

Data annotation took two weeks, but this demanding process became the foundation of the entire project, as model performance directly depends on dataset quality. “I reviewed every frame twice, correcting errors and refining object boundaries. The biggest challenge was doing it all for the first time – with no prior experience and no mentor,” Oleksandr admits. “I had to find answers myself: how should partially obscured objects be labeled? What to do with blurred frames? How to handle small, distant objects on the horizon?” After two weeks of work, he had a complete dataset ready for experimentation.

Once the data was prepared, Oleksandr set himself a challenge – to build a working model from scratch in just 48 hours. It was a deliberate test of focus and endurance: minimal sleep, minimal breaks, complete dedication to a single goal. He defined clear parameters – 48 hours to train two models of different complexity and compare their performance.

Technology stack

The developer worked with Python 3, using OpenCV for image processing and video preparation. For the model itself, he selected Ultralytics YOLO, an open-source framework for object detection tasks. YOLO (You Only Look Once) is a family of neural network architectures designed for real-time object recognition.

For comparison, two architectures were selected:

  • YOLOv8s (Small) – used for the initial v0.1 prototype, a lightweight model optimized for maximum speed.
  • YOLOv8m (Medium) – a more powerful v0.2 model expected to deliver higher detection accuracy.

The training was conducted in Google Colab using a T4 GPU. This free yet powerful environment allows for rapid experimentation without the need for expensive personal hardware. The author considers it a lifeline for students or self-taught developers who don’t have access to costly servers.

Model Training: Facts and Figures

“I trained both models for 100 epochs. An epoch is one full pass through the entire dataset during training,” says Oleksandr. “More epochs allow the model to ‘learn’ the data better but also increase the risk of overfitting.”
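For readers curious what such a training run looks like in practice, here is a minimal sketch using the Ultralytics Python API. The two model variants and the 100-epoch setting come from the article. The dataset file name `voron_dataset.yaml`, the 640-pixel image size, and the choice to start from pretrained weights are assumptions made for illustration.

```python
# Settings taken from the article: two YOLOv8 variants, 100 epochs each.
RUNS = {
    "v0.1": {"weights": "yolov8s.pt", "epochs": 100},
    "v0.2": {"weights": "yolov8m.pt", "epochs": 100},
}
DATA_YAML = "voron_dataset.yaml"  # hypothetical dataset description file
IMAGE_SIZE = 640  # assumed; this is the Ultralytics default


def train(version):
    """Train one model variant and return the Ultralytics results object."""
    from ultralytics import YOLO  # lazy import: requires `pip install ultralytics`

    run = RUNS[version]
    model = YOLO(run["weights"])  # assumed starting point: pretrained weights
    return model.train(data=DATA_YAML, epochs=run["epochs"], imgsz=IMAGE_SIZE)
```

In a Colab notebook with a GPU runtime, calling `train("v0.1")` and then `train("v0.2")` would produce two sets of weights and metrics to compare.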

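mAP (mean Average Precision) is a standard metric used to evaluate object detection models. It reflects both how reliably the model finds objects and how precisely it places their bounding boxes. The v0.2 model achieved an mAP@50 score of 0.794. The “@50” means a detection counts as correct when its box overlaps the annotated box with an intersection over union (IoU) of at least 0.5; average precision is then computed for each class and averaged across all five. The score is often read loosely as “about 80% of objects detected,” but it is a summary of precision and recall rather than a literal hit rate. For a model built in just two days by a self-taught developer with no prior experience, this is a solid result.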
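The “50” in mAP@50 refers to an overlap threshold: a predicted box is accepted as a match only if its intersection over union (IoU) with the annotated box is at least 0.5. The sketch below shows that overlap test on its own. It is not a full mAP computation, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    intersection = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted, annotated, threshold=0.5):
    """True if the predicted box overlaps the annotation enough to count."""
    return iou(predicted, annotated) >= threshold


annotated = (0, 0, 10, 10)
print(is_match((2, 0, 12, 10), annotated))  # IoU = 80 / 120, prints True
print(is_match((5, 0, 15, 10), annotated))  # IoU = 50 / 150, prints False
```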

Differences Between v0.1 and v0.2

Apart from the architecture, the main difference between the two models is the amount of training data. The v0.1 model was trained on 2,100 images and served as a quick proof of concept, with a training time of just 40 minutes. In contrast, v0.2 had access to 13,890 images, allowing it to learn to recognize objects under a wider range of conditions, including varying flight altitudes, times of day, weather conditions, and viewing angles. Training v0.2 took over 7 hours.

The difference is also clear in how the models “see” objects: v0.1 frequently misses targets or misclassifies them.

However, v0.2 consistently detects objects in video footage, even when they are partially obscured or set against complex backgrounds.

The developer notes that he did not write code from scratch but managed all other aspects of the project: preparing and annotating frames, configuring YOLO, selecting optimal training hyperparameters, training the models, analyzing metrics and choosing architectures, and testing the model on real video. Oleksandr has no formal IT education, relying primarily on AI assistants as his source of knowledge.

“In two weeks, I went from being a complete beginner to someone capable of independently training a computer vision model with solid metrics,” he said. “This demonstrates that the entry barrier to ML and computer vision today is much lower than it seems.”

Testing and Current Limitations

So far, all tests have been conducted on recorded video rather than in real time. The model has not yet been integrated into a drone control system and has not been tested in operational conditions. The goal of this stage is to demonstrate that Ukrainian developers, even without significant resources, can create such solutions. This serves as a proof of concept, showcasing capabilities and marking the first step toward a more advanced system.

Processing speed on a T4 GPU is approximately 38 FPS for both models. While this is fast enough for real-time applications, actual performance in the field may be lower due to additional system delays, camera signal processing, data transmission, and other factors.
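A quick back-of-the-envelope calculation shows how such delays eat into the frame rate. At 38 FPS the model spends about 26 ms on each frame. The sketch below assumes that any extra delay is added to every frame in sequence, and the 15 ms figure is purely hypothetical, not a measurement from the project.

```python
def frame_time_ms(fps):
    """Time budget per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def effective_fps(inference_fps, extra_latency_ms):
    """Frame rate once a fixed per-frame delay is added to inference time."""
    return 1000.0 / (frame_time_ms(inference_fps) + extra_latency_ms)


print(round(frame_time_ms(38), 1))  # prints 26.3
print(round(effective_fps(38, 15), 1))  # hypothetical 15 ms overhead, prints 24.2
```

Under that assumption, 15 ms of added delay per frame would bring the pipeline down to roughly 24 FPS.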

Why “Voron”?

The name carries symbolic meaning. Oleksandr’s company is called Svarog, after the ancient Slavic god of blacksmithing, and he wanted all projects to have Ukrainian names connected to the country’s culture and history. In Slavic mythology, the raven (voron) represents wisdom, foresight, and the ability to see what is hidden from others. Ravens have keen eyesight and can notice even the smallest details.

Additionally, in many cultures, ravens are associated with battlefields, guardianship, and guidance. For this reason, Voron is considered an ideal name for a computer vision system designed to help Ukrainian military personnel see more clearly, quickly, and accurately.

Ethics and Responsibility

Developing technology for military applications inevitably raises ethical considerations. The developer has drawn several conclusions for himself. The model will not be open-source, and any sharing will occur only under non-disclosure agreements (NDAs). Before any real-world implementation, he will focus on protective systems designed to maximize civilian safety. The model is intended to recognize only military targets and avoid false detections of civilian vehicles, buildings, or people in non-military attire. The technology will be provided exclusively to Ukrainian military organizations after thorough verification.

Oleksandr’s goal is to understand how AI can practically assist military personnel. Drone operators work in demanding conditions, often at night or in poor weather, where attention flags and targets may be missed. AI does not tire and can analyze each frame with consistent attention, whether it’s the first minute or the fifth hour. Teaching a drone to see more clearly, faster, and more reliably could reduce missed targets and, in turn, help protect the lives of both service members and civilians.

v0.3 “Night Vision”

Voron demonstrates that technological innovation in Ukraine can develop quickly and efficiently. The next stage is v0.3, a model capable of detecting objects in the thermal spectrum. Thermal cameras allow targets to be seen in complete darkness, through smoke, fog, or some vegetation.

Updates on Voron’s development are posted on Oleksandr’s Telegram channel.

Svitlana Anisimova