Commit cdfd364

adding support for arkitscenes

1 parent b9d92fa, commit cdfd364

27 files changed: +1191 -10 lines changed

DATA.md

Lines changed: 34 additions & 0 deletions
@@ -10,6 +10,7 @@ We list the available data used in the current version of CrossOver in the table
 | ------------ | ----------------------------- | ----------------------------------- | -------------------------- | -------------------------- |
 | ScanNet | `[point, rgb, cad, referral]` | `[point, rgb, floorplan, referral]` |||
 | 3RScan | `[point, rgb, referral]` | `[point, rgb, referral]` |||
+| ARKitScenes | `[point, rgb, referral]` | `[point, rgb, referral]` |||


 We detail data download and release instructions for preprocessing with scripts for ScanNet + 3RScan.
@@ -110,4 +111,37 @@ Scan3R/
 |   │   ├── objectsDataMultimodal.pt -> object data combined from data1D.pt + data2D.pt + data3D.pt (for easier loading)
 |   │   └── sel_cams_on_mesh.png (visualisation of the cameras selected for computing RGB features per scan)
 |   └── ...
+```
+
+### ARKitScenes
+
+#### Running preprocessing scripts
+Adjust the path parameters of `ARKitScenes` in the config files under `configs/preprocess`. Run the following (after changing the `--config-path` in the bash file):
+
+```bash
+$ bash scripts/preprocess/process_arkit.sh
+```
+
+Our script for ARKitScenes dataset performs the following additional processing:
+
+- 3D-to-2D projection for 2D segmentation and stores as `gt-projection-seg.pt` for each scan.
+
+Post running preprocessing, the data structure should look like the following:
+
+```
+ARKitScenes/
+├── objects_chunked/ (object data chunked into hdf5 format for instance baseline training)
+|   ├── train_objects.h5
+|   └── val_objects.h5
+├── scans/
+|   ├── 40753679/
+|   │   ├── gt-projection-seg.pt -> 3D-to-2D projected data consisting of framewise 2D instance segmentation
+|   │   ├── data1D.pt -> all 1D data + encoded (object referrals + BLIP features)
+|   │   ├── data2D.pt -> all 2D data + encoded (RGB + floorplan + DinoV2 features)
+|   │   ├── data2D_all_images.pt (RGB features of every image of every scan)
+|   │   ├── data3D.pt -> all 3D data + encoded (Point Cloud + I2PMAE features - object only)
+|   │   ├── object_id_to_label_id_map.pt -> Instance ID to NYU40 Label mapped
+|   │   ├── objectsDataMultimodal.pt -> object data combined from data1D.pt + data2D.pt + data3D.pt (for easier loading)
+|   │   └── sel_cams_on_mesh.png (visualisation of the cameras selected for computing RGB features per scan)
+|   └── ...
 ```
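
The DATA.md addition above mentions a 3D-to-2D projection step that produces framewise 2D instance segmentation, but the commit view does not show how it is computed. As a rough illustration only, the sketch below projects labelled 3D points into one frame with a plain pinhole camera model and keeps the closest point per pixel. The function name `project_instances`, the 4x4 world-to-camera matrix layout, and the use of `0` for "no label" are all assumptions made for this example; they are not taken from the repository, and the actual contents of `gt-projection-seg.pt` may differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def project_instances(
    points: Sequence[Point],
    instance_ids: Sequence[int],
    world_to_cam: Sequence[Sequence[float]],  # assumed 4x4 row-major extrinsics
    fx: float, fy: float, cx: float, cy: float,  # pinhole intrinsics
    width: int, height: int,
) -> List[List[int]]:
    """Return a height x width grid of instance ids (0 = nothing projected)."""
    seg = [[0] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for (x, y, z), inst in zip(points, instance_ids):
        # Transform the world-space point into the camera frame.
        xc, yc, zc = (
            r[0] * x + r[1] * y + r[2] * z + r[3] for r in world_to_cam[:3]
        )
        if zc <= 0:  # point lies behind the camera
            continue
        # Perspective division followed by the intrinsics gives pixel coordinates.
        u = int(round(fx * xc / zc + cx))
        v = int(round(fy * yc / zc + cy))
        if 0 <= u < width and 0 <= v < height and zc < depth[v][u]:
            depth[v][u] = zc  # z-buffer: keep the closest point per pixel
            seg[v][u] = inst
    return seg
```

Running this once per selected frame would give one label grid per image. A real pipeline would more likely work on the scan mesh and use vectorised array operations, so treat this as a statement of the idea rather than of the method used in the commit.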

README.md

Lines changed: 1 addition & 0 deletions
@@ -117,6 +117,7 @@ See [DATA.MD](DATA.md) for detailed instructions on data download, preparation a
 | ------------ | ----------------------------- | ----------------------------------- | -------------------------- | -------------------------- |
 | Scannet | `[point, rgb, cad, referral]` | `[point, rgb, floorplan, referral]` |||
 | 3RScan | `[point, rgb, referral]` | `[point, rgb, referral]` |||
+| ARKitScenes | `[point, rgb, referral]` | `[point, rgb, referral]` |||

 > To run our demo, you only need to download generated embedding data; no need for any data preprocessing.

TRAIN.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ $ bash scripts/train/train_instance_crossover.sh
 ```

 #### Train Scene Retrieval Pipeline
-Adjust path/configuration parameters in `configs/train/train_scene_crossover.yaml`. You can also add your customised dataset or choose to train on Scannet & 3RScan or either. Run the following:
+Adjust path/configuration parameters in `configs/train/train_scene_crossover.yaml`. You can also add your customised dataset or choose to train on Scannet, 3RScan & ARKitScenes or any combination of the same. Run the following:

 ```bash
 $ bash scripts/train/train_scene_crossover.sh

configs/evaluation/eval_instance.yaml

Lines changed: 11 additions & 1 deletion
@@ -43,13 +43,23 @@ data :
     max_object_len : 150
     voxel_size : 0.02

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    avail_modalities : ['point', 'cad', 'rgb', 'referral']
+    max_object_len : 150
+    voxel_size : 0.02
+
 task:
   name : InferenceObjectRetrieval
   InferenceObjectRetrieval:
     val : [Scannet]
     modalities : ['rgb', 'point', 'cad', 'referral']
     scene_modalities : ['rgb', 'point', 'referral', 'floorplan']
-    ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/instance_crossover_scannet+scan3r.pth
+    ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/instance_crossover_scannet+scan3r+arkit.pth


 inference_module: ObjectRetrieval

configs/evaluation/eval_scene.yaml

Lines changed: 11 additions & 1 deletion
@@ -43,13 +43,23 @@ data :
     max_object_len : 150
     voxel_size : 0.02

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    avail_modalities : ['point', 'cad', 'rgb', 'referral']
+    max_object_len : 150
+    voxel_size : 0.02
+
 task:
   name : InferenceSceneRetrieval
   InferenceSceneRetrieval:
     val : [Scannet]
     modalities : ['rgb', 'point', 'cad', 'referral']
     scene_modalities : ['rgb', 'point', 'referral', 'floorplan'] #, 'point']
-    ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/scene_crossover_scannet+scan3r.pth
+    ckpt_path : /drive/dumps/multimodal-spaces/runs/release_runs/scene_crossover_scannet+scan3r+arkit.pth

 inference_module: SceneRetrieval
 model:

configs/preprocess/process_1d.yaml

Lines changed: 8 additions & 0 deletions
@@ -25,6 +25,14 @@ data:
     label_filename : labels.instances.align.annotated.v2.ply
     skip_frames : 1

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    skip_frames : 1
+
   Shapenet:
     base_dir : /drive/datasets/Shapenet/ShapeNetCore.v2/

configs/preprocess/process_2d.yaml

Lines changed: 8 additions & 0 deletions
@@ -27,6 +27,14 @@ data:
     label_filename : labels.instances.align.annotated.v2.ply
     skip_frames : 1

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    skip_frames : 1
+
 modality_info:
   1D :
     feature_extractor:

configs/preprocess/process_3d.yaml

Lines changed: 8 additions & 0 deletions
@@ -24,6 +24,14 @@ data:
     processor1D : Scan3R1DProcessor
     label_filename : labels.instances.align.annotated.v2.ply

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    skip_frames : 1
+
 modality_info:
   1D :
     feature_extractor:

configs/preprocess/process_multimodal.yaml

Lines changed: 9 additions & 0 deletions
@@ -28,6 +28,15 @@ data:
     skip_frames : 1
     avail_modalities : ['point', 'rgb', 'referral']

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    chunked_dir : ${data.process_dir}/ARKitScenes/objects_chunked
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    avail_modalities : ['point', 'rgb', 'referral']
+
 modality_info:
   1D :
     feature_extractor:
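
The `process_dir` and `chunked_dir` entries added above both build on `${data.process_dir}`, which reads like OmegaConf/Hydra-style interpolation. The loader itself is not part of this diff, so that is an assumption. To show what such a reference expands to, here is a small standard-library sketch that resolves `${a.b}` references in a nested dictionary. The helper names `lookup` and `resolve` and the root value `/tmp/processed` are hypothetical and chosen only for the example.

```python
import re

_REF = re.compile(r"\$\{([^}]+)\}")


def lookup(cfg: dict, dotted: str):
    """Follow a dotted key such as 'data.process_dir' through nested dicts."""
    node = cfg
    for key in dotted.split("."):
        node = node[key]
    return node


def resolve(value, root: dict):
    """Recursively replace ${dotted.key} references with values from root."""
    if isinstance(value, dict):
        return {k: resolve(v, root) for k, v in value.items()}
    if isinstance(value, str):
        return _REF.sub(
            lambda m: str(resolve(lookup(root, m.group(1)), root)), value
        )
    return value


cfg = {
    "data": {
        "process_dir": "/tmp/processed",  # hypothetical root, not from the commit
        "ARKitScenes": {
            "process_dir": "${data.process_dir}/ARKitScenes/scans",
            "chunked_dir": "${data.process_dir}/ARKitScenes/objects_chunked",
        },
    }
}
resolved = resolve(cfg, cfg)
print(resolved["data"]["ARKitScenes"]["process_dir"])
```

Note that `base_dir` in the added blocks is an absolute path on one machine, so anyone reusing these configs would need to point it at their own ARKitScenes download.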

configs/train/train_instance_baseline.yaml

Lines changed: 11 additions & 0 deletions
@@ -44,6 +44,17 @@ data :
     max_object_len : 150
     voxel_size : 0.02

+  ARKitScenes:
+    base_dir : /Users/gauravpradeep/CrossOver_ScaleUp/ARKitScenes
+    process_dir : ${data.process_dir}/ARKitScenes/scans
+    chunked_dir : ${data.process_dir}/ARKitScenes/objects_chunked
+    processor3D : ARKitScenes3DProcessor
+    processor2D : ARKitScenes2DProcessor
+    processor1D : ARKitScenes1DProcessor
+    avail_modalities : ['point', 'rgb', 'referral']
+    max_object_len : 150
+    voxel_size : 0.02
+
 task:
   name : ObjectLevelGrounding
   ObjectLevelGrounding :
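
Since this training config reads from the preprocessed `scans` and `objects_chunked` directories, it may be worth checking that preprocessing produced the per-scan files listed in the DATA.md changes before starting a run. The sketch below is one possible check using only `pathlib`. The file names are copied from the DATA.md tree in this commit, while `missing_scan_files` and `incomplete_scans` are hypothetical helpers that do not exist in the repository.

```python
from pathlib import Path
from typing import Dict, List

# Per-scan outputs listed in the DATA.md directory tree for ARKitScenes.
EXPECTED_SCAN_FILES = [
    "gt-projection-seg.pt",
    "data1D.pt",
    "data2D.pt",
    "data2D_all_images.pt",
    "data3D.pt",
    "object_id_to_label_id_map.pt",
    "objectsDataMultimodal.pt",
    "sel_cams_on_mesh.png",
]


def missing_scan_files(scan_dir: Path) -> List[str]:
    """Return the expected file names that are absent from one scan folder."""
    return [n for n in EXPECTED_SCAN_FILES if not (scan_dir / n).is_file()]


def incomplete_scans(process_root: Path) -> Dict[str, List[str]]:
    """Map scan id -> missing files for every incomplete scan under scans/."""
    report = {}
    for scan_dir in sorted((process_root / "scans").iterdir()):
        if scan_dir.is_dir():
            missing = missing_scan_files(scan_dir)
            if missing:
                report[scan_dir.name] = missing
    return report
```

Calling `incomplete_scans(Path("<process_dir>/ARKitScenes"))` with your own processed-data root returns an empty dictionary when every scan folder is complete.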
