dirtmaxim committed on
Commit f24a160 · verified · 1 Parent(s): 2ef6c76

Update README.md

Files changed (1)
  1. README.md +32 -27
README.md CHANGED
@@ -17,51 +17,56 @@ tags:
17
  - UAV
18
  - drone
19
  - video
20
- model_description: "Behavior recognition model for in situ drone videos of baboons, built using X3D model. It is trained on the BaboonLand mini-scene dataset, which is comprised of 20 hours of aerial video footage of baboons captured using a DJI Mavic 2S drone."
21
  ---
22
 
23
- # Model Card for X3D-KABR-Kinetics
24
- x3d-BaboonLand is a behavior recognition model for in situ drone videos of zbaboons,
25
- built using X3D model.
26
- It is trained on the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset.
27
- It includes both spatiotemporal (i.e., mini-scenes) and behavior annotations provided by an expert
28
- behavioral ecologist.
29
 
30
  ## Model Details
31
 
32
  ### Model Description
33
 
34
  - **Developed by:** Isla Duporge, Maksim Kholiavchenko, Roi Harel, Scott Wolf, Daniel Rubenstein, Meg Crofoot, Tanya Berger-Wolf, Stephen Lee, Julie Barreau, Jenna Kline, Michelle Ramirez, Charles Stewart
35
-
36
  - **Model type:** X3D-L
37
  - **License:** MIT
38
  - **Fine-tuned from model:** [X3D-L](https://github.com/facebookresearch/SlowFast/blob/main/configs/Kinetics/X3D_L.yaml)
39
 
40
- This model was developed for the benefit of the community as an open-source product, thus we request that any derivative products are also open-source.
41
 
42
  ### Model Sources
43
 
44
- - **Repository:** [Project Repo](https://github.com/Imageomics/kabr-tools)
45
- - **Paper:** [Paper Link](https://link.springer.com/article/10.1007/s11263-025-02493-5)
 
46
  - **Project Page:** [BaboonLand Project Page](https://baboonland.xyz)
47
 
 
 
 
 
 
 
 
 
 
 
48
  ## Uses
49
 
50
- Baboon behavior recognition form in situ drone videos.
51
 
52
  ### Out-of-Scope Use
53
 
54
- This model was trained to detect and classify behavior from drone videos of baboons in Kenya. It may not perform well on other species or settings.
55
-
56
 
57
  ## How to Get Started with the Model
58
 
59
- Please see the illustrative examples in the [kabr-tools docs](https://imageomics.github.io/kabr-tools/)
60
- for more information on how this model can be used.
61
 
62
  ## Training Details
63
 
64
- We include the configuration file ([config.yaml](https://huggingface.co/imageomics/x3d-BaboonLand/blob/main/config.yaml)) utilized by SlowFast for X3D model training.
65
 
66
  ### Training Data
67
 
@@ -69,17 +74,17 @@ This model was trained on the [BaboonLand](https://huggingface.co/datasets/image
69
 
70
  #### Training Hyperparameters
71
 
72
- The model was trained for 120 epochs, using a batch size of 5.
73
- We used the EQL loss function to address the long-tailed class distribution and SGD optimizer with a learning rate of 1e5.
74
- We used a sample rate of 16x5, and random weight initialization.
75
 
76
  ## Evaluation
77
 
78
- The dataset was evaluated on the X3D-L model utilizing the [SlowFast](https://github.com/facebookresearch/SlowFast) framework, specifically utilizing the [test_net script](https://github.com/facebookresearch/SlowFast/blob/main/tools/test_net.py).
79
 
80
  ### Testing Data
81
 
82
- We provide a train-test split of the mini-scenes from the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) for evaluation purposes, with 75% for train and 25% for testing. No mini-scene was divided by the split.
83
 
84
  #### Metrics
85
 
@@ -87,17 +92,17 @@ We report Top-1, Top-3, and Top-5 macro-scores. For full details, please refer t
87
 
88
  **Micro-Average (Per Instance) Scores**
89
 
90
- | WI | BS | Top-1 | Top-3 | Top-5 |
91
- |----------|----|----------|----------|----------|
92
- | Random | 5 | **64.89** | **92.54**| **96.66**|
93
 
94
  ### Model Architecture and Objective
95
 
96
- Please see the [Base Model Description](https://arxiv.org/pdf/2004.04730).
97
 
98
  #### Hardware
99
 
100
- Running the X3D model requires a modern NVIDIA GPU with CUDA support. X3D-L is designed to be computationally efficient, and requires 10–16 GB of GPU memory during training.
101
 
102
  ## Citation
103
 
 
17
  - UAV
18
  - drone
19
  - video
20
+ model_description: "Behavior recognition model for in situ drone videos of baboons, built using an X3D model. It was trained on the BaboonLand mini-scene dataset, which is comprised of 20 hours of aerial video footage of baboons captured using a DJI Mavic 2S drone."
21
  ---
22
 
23
+ # Model Card for x3d-BaboonLand
24
+
25
+ x3d-BaboonLand is a behavior recognition model for in situ drone videos of baboons, built using the X3D architecture. It was trained on the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset, which includes both spatiotemporal clips (mini-scenes) and behavior annotations provided by an expert behavioral ecologist.
 
 
 
26
 
27
  ## Model Details
28
 
29
  ### Model Description
30
 
31
  - **Developed by:** Isla Duporge, Maksim Kholiavchenko, Roi Harel, Scott Wolf, Daniel Rubenstein, Meg Crofoot, Tanya Berger-Wolf, Stephen Lee, Julie Barreau, Jenna Kline, Michelle Ramirez, Charles Stewart
 
32
  - **Model type:** X3D-L
33
  - **License:** MIT
34
  - **Fine-tuned from model:** [X3D-L](https://github.com/facebookresearch/SlowFast/blob/main/configs/Kinetics/X3D_L.yaml)
35
 
36
+ This model was developed for the benefit of the community as an open-source product; we request that derivative products also remain open-source.
37
 
38
  ### Model Sources
39
 
40
+ - **Repository:** [kabr-tools](https://github.com/Imageomics/kabr-tools)
41
+ - **BaboonLand scripts:** [BaboonLand/scripts](https://huggingface.co/datasets/imageomics/BaboonLand/tree/main/BaboonLand/scripts)
42
+ - **Paper:** [BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos](https://link.springer.com/article/10.1007/s11263-025-02493-5)
43
  - **Project Page:** [BaboonLand Project Page](https://baboonland.xyz)
44
 
45
+ ### Data Processing Software
46
+
47
+ The [kabr-tools](https://github.com/Imageomics/kabr-tools) repository is the primary open-source package used as the basis for processing and formatting data for behavior-recognition workflows. For BaboonLand, we did **not** duplicate the full codebase into this model repository. Instead, we used the `kabr-tools` workflow with BaboonLand-specific inputs and lightweight script adaptations.
48
+
49
+ In particular, several scripts used for BaboonLand were derived from `kabr-tools` utilities, but were adapted for this dataset and renamed for clarity. The resulting BaboonLand-specific scripts are provided here:
50
+
51
+ - [BaboonLand/scripts](https://huggingface.co/datasets/imageomics/BaboonLand/tree/main/BaboonLand/scripts)
52
+
53
+ These scripts document the dataset-specific preprocessing used for BaboonLand, while `kabr-tools` remains the main reference implementation for the broader workflow.
54
+
55
  ## Uses
56
 
57
+ This model is intended for baboon behavior recognition from in situ drone videos.
58
 
59
  ### Out-of-Scope Use
60
 
61
+ This model was trained to classify behavior from drone videos of baboons in Kenya. It may not perform well for other species, environments, camera viewpoints, annotation schemes, or behavior taxonomies.
 
62
 
63
  ## How to Get Started with the Model
64
 
65
+ Please see the illustrative examples in the [kabr-tools docs](https://imageomics.github.io/kabr-tools) for the general workflow.
 
66
 
67
  ## Training Details
68
 
69
+ We include the configuration file ([config.yaml](https://huggingface.co/imageomics/x3d-BaboonLand/blob/main/config.yaml)) used for X3D training in SlowFast.
70
 
71
  ### Training Data
72
 
 
74
 
75
  #### Training Hyperparameters
76
 
77
+ The model was trained for 120 epochs using a batch size of 5.
78
+ We used the EQL loss function to address the long-tailed class distribution and SGD optimization with a learning rate of `1e-5`.
79
+ We used a sample rate of `16x5` and random weight initialization.
80
 
81
  ## Evaluation
82
 
83
+ The model was evaluated using the [SlowFast](https://github.com/facebookresearch/SlowFast) framework, specifically the [test_net.py](https://github.com/facebookresearch/SlowFast/blob/main/tools/test_net.py) evaluation script.
84
 
85
  ### Testing Data
86
 
87
+ We provide a train-test split of the mini-scenes from the [BaboonLand](https://huggingface.co/datasets/imageomics/BaboonLand) dataset for evaluation, with 75% used for training and 25% for testing. No mini-scene was split across train and test partitions.
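
The constraint that no mini-scene is split across partitions can be illustrated with a group-wise split. The following is a minimal sketch only, not the procedure used to produce the released split: the function name `split_by_mini_scene`, the `(mini_scene_id, clip_path)` input format, the fixed seed, and applying the 75% fraction to mini-scenes rather than to individual clips are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_mini_scene(clips, train_fraction=0.75, seed=0):
    """Split (mini_scene_id, clip_path) pairs so that every mini-scene
    lands entirely in train or entirely in test (hypothetical helper)."""
    # Group clips by the mini-scene they were cut from.
    groups = defaultdict(list)
    for scene_id, clip_path in clips:
        groups[scene_id].append(clip_path)

    # Shuffle mini-scene IDs deterministically, then cut at the train fraction.
    scene_ids = sorted(groups)
    random.Random(seed).shuffle(scene_ids)
    n_train = round(len(scene_ids) * train_fraction)

    train = [(s, c) for s in scene_ids[:n_train] for c in groups[s]]
    test = [(s, c) for s in scene_ids[n_train:] for c in groups[s]]
    return train, test
```

Splitting on mini-scene IDs instead of individual clips keeps near-duplicate frames from the same track out of both partitions, which is the property the paragraph above describes.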
88
 
89
  #### Metrics
90
 
 
92
 
93
  **Micro-Average (Per Instance) Scores**
94
 
95
+ | WI | BS | Top-1 | Top-3 | Top-5 |
96
+ |---------|----|------:|------:|------:|
97
+ | Random | 5 | 64.89 | 92.54 | 96.66 |
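
For readers unfamiliar with the distinction between micro-averaged (per-instance) and macro-averaged (per-class) Top-k scores, here is a minimal sketch in plain Python. It is not the SlowFast evaluation code; the function names and the input format (one list of class scores per clip, plus integer labels) are assumptions made for illustration.

```python
from collections import defaultdict


def _hit(row, label, k):
    """True if the true label is among the k highest-scoring classes."""
    top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
    return label in top_k


def micro_top_k_accuracy(scores, labels, k):
    """Per-instance accuracy: every clip counts equally."""
    hits = sum(_hit(row, label, k) for row, label in zip(scores, labels))
    return hits / len(labels)


def macro_top_k_accuracy(scores, labels, k):
    """Per-class accuracy: average the hit rate of each class,
    so rare behaviors weigh as much as common ones."""
    per_class = defaultdict(list)
    for row, label in zip(scores, labels):
        per_class[label].append(_hit(row, label, k))
    rates = [sum(hits) / len(hits) for hits in per_class.values()]
    return sum(rates) / len(rates)
```

On a long-tailed class distribution the two averages can differ noticeably, because the micro-average is dominated by the most frequent behaviors.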
98
 
99
  ### Model Architecture and Objective
100
 
101
+ Please see the [base model description](https://arxiv.org/pdf/2004.04730).
102
 
103
  #### Hardware
104
 
105
+ Running the X3D-L model requires a modern NVIDIA GPU with CUDA support. X3D-L is designed to be computationally efficient and typically requires 10–16 GB of GPU memory during training.
106
 
107
  ## Citation
108