
STARS ✨


[Project Page]

This is an official PyTorch implementation of STARS: Self-supervised Tuning for 3D Action Recognition in Skeleton Sequences.

Data Preparation ⚙️

Follow the same steps as explained in the MAMP repository and put the result in the ./data directory. After data preparation, the following data structure is expected:

.
└── data/
    ├── NTU60_XSub.npz
    ├── NTU60_XView.npz
    ├── NTU120_XSub.npz
    ├── NTU120_XSet.npz
    └── PKUv2_XSub.npz
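
As a quick sanity check (a minimal sketch; it assumes only the file names listed above), you can verify that all five archives are in place:

for f in NTU60_XSub.npz NTU60_XView.npz NTU120_XSub.npz NTU120_XSet.npz PKUv2_XSub.npz; do
    # Report whether each expected archive is present
    [ -f "data/$f" ] && echo "OK       data/$f" || echo "MISSING  data/$f"
done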

Self-supervised Pre-Training 📉

First stage

The first stage is trained using the MAMP method. You can refer to their repository and download the model weights. Create a new directory called pretrained_weights in the root directory of the project and place their weights there, adding mamp_ at the beginning of each file name. After this step, the following file structure is expected:

.
├── config
├── data
├── ...
└── pretrained_weights/
    ├── mamp_ntu60_xsub.pth
    ├── mamp_ntu60_xview.pth
    ├── mamp_ntu120_xsub.pth
    └── mamp_ntu120_xview.pth
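
For example (a sketch only; <downloaded-weight>.pth is a placeholder for whatever file name the downloaded MAMP checkpoint actually has):

mkdir -p pretrained_weights
# Prefix each downloaded MAMP weight with mamp_, e.g. for NTU-60 XSub:
mv <downloaded-weight>.pth pretrained_weights/mamp_ntu60_xsub.pth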

Note: Since they did not provide weights for the PKU-MMD v2 experiment, we retrained the model on our own systems. This is why the result we report in the paper differs from what they report; retraining was necessary for a fair comparison.

Second stage

You can tune the weights of MAMP by running the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun \
                             --nnodes=1 \
                             --nproc_per_node=4 \
                             main.py \
                             pretrain_clrtuning \
                             --config config/ntu60/pretext/clrtuning_xsub_merged.yaml \
                             --use-wandb \
                             --wandb-name CLRTuning-XSub

In the example above, 0,1,2,3 are the GPU IDs (we used 4 GPUs; change it to 0 if you have a single GPU) and --nproc_per_node=4 is the number of GPUs. Additionally, since we use wandb for logging, we set --use-wandb to upload the logs to the server and --wandb-name as the run name.
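
For instance, the same run on a single GPU would be:

CUDA_VISIBLE_DEVICES=0 torchrun \
                       --nnodes=1 \
                       --nproc_per_node=1 \
                       main.py \
                       pretrain_clrtuning \
                       --config config/ntu60/pretext/clrtuning_xsub_merged.yaml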

You can also run the XView evaluation by changing the config to clrtuning_xview_second.yaml. Upon completion, go inside the work_dir and copy the best checkpoint to pretrained_weights, using the name expected in the config file of the third stage (see the sketch below).
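
A minimal sketch of that step, with <run-name> and <best-ckpt>.pt as placeholders for your actual run directory and best checkpoint (the evaluation command in the next section, for example, expects the name clrtuning_thirdstage_ntu60_view.pt):

# Copy the best checkpoint under the name the third-stage config expects
cp work_dir/<run-name>/<best-ckpt>.pt pretrained_weights/clrtuning_thirdstage_ntu60_view.pt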

Alternatively, you can download the weights from here.

Tuning and Evaluation 📊

For all the evaluations, we use the same training code as MAMP for a fair comparison. We have placed the modified version of their repository here, under the MAMP directory. You can run the evaluations similar to what is stated in the provided bash files, e.g. you can run the semi-supervised training for NTU-60 XView as follows:

python3 -m torch.distributed.launch --nproc_per_node=1 --master_port 12345 main_finetune.py \
        --config ./config/ntu60_xview_joint/semi_0.1_t120_layer8_decay.yaml \
        --output_dir ./output_dir/ntu60_xview/semi1_1/ \
        --resume  ./output_dir/ntu60_xview/semi1_1/checkpoint-last.pth \
        --log_dir ./output_dir/ntu60_xview/semi1_1_log/ \
        --finetune ../pretrained_weights/clrtuning_thirdstage_ntu60_view.pt \
        --dist_eval \
        --lr 3e-4 \
        --min_lr 1e-5

Notes:

  • In the config YAML files, data_path values start with /path/to/; replace this placeholder with the location of your data (see the example after these notes).
  • The resume flag in the command above is for the case where your job was preempted and you want to resume from the last checkpoint. If the file does not exist, training starts from the beginning.
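
For example, one way to rewrite the placeholder in a config file (a sketch assuming GNU sed and a hypothetical data location; run it from the MAMP directory and adjust the path to your setup):

# Replace the /path/to/ placeholder with the actual data directory
sed -i 's#/path/to/#/absolute/path/to/data/#g' ./config/ntu60_xview_joint/semi_0.1_t120_layer8_decay.yaml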

K-NN and Few-shot evaluation

You can run the K-NN evaluation as follows:

python3 knn_eval.py \
        --data-path <PATH-TO-DATA> \
        --checkpoint <PATH-TO-MODEL-WEIGHT>

For few-shot evaluation, you need to download the data from here. The data consists of all NTU-120 XSub samples except those already present in NTU-60 XSub. For the checkpoint, use the weights of the NTU-60 XSub model.
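
As a sketch, assuming knn_eval.py accepts the few-shot split through the same flags as above (verify against the script's argument parser):

python3 knn_eval.py \
        --data-path <PATH-TO-FEWSHOT-DATA> \
        --checkpoint <PATH-TO-NTU60-XSUB-WEIGHT>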

Results 📜

You can find the pretrained weights here.

| Protocols  | NTU-60 X-sub | NTU-60 X-view | NTU-120 X-sub | NTU-120 X-set |
|------------|--------------|---------------|---------------|---------------|
| Linear     | 87.1         | 90.9          | 79.9          | 80.8          |
| KNN (k=1)  | 79.9         | 88.6          | 67.6          | 67.7          |
| Finetune   | 93.0         | 97.5          | 89.9          | 91.4          |

Acknowledgement 🙏

Our code is extended from the following repositories. We sincerely thank the authors for releasing their code.

  • The overall framework is based on ActCLR.
  • The encoder is based on MAMP.

License

This project is licensed under the terms of the MIT license.
