feature(zms): add new league middlewares and other models and tools. #458
Open
hiha3456 wants to merge 288 commits into opendilab:main from hiha3456:dev-distar-merge-into-main
Commits
b404024
change a bit
hiha3456 ffcc50b
change a bit
hiha3456 a55fe17
change a bit
hiha3456 8084dae
change a bit
hiha3456 da46b22
change a bit
hiha3456 65b8762
change a bit
hiha3456 ad9632c
change position of policy_resetter, battle_rolloutor, battle_inferencer
hiha3456 192e6e4
polish stdim & infonce loss
lixl-st 46ee7cf
polish EventEnum
lixl-st 654eb9f
Merge branch 'main' of https://github.com/opendilab/DI-engine into de…
lixl-st 48ddc09
polish EventEnum
lixl-st 474524e
simplify the code
hiha3456 9d91063
for multiple policies
hiha3456 6ca4ae2
change a bit
hiha3456 2d85ccf
fix style
lixl-st 157b9b3
revert to drop the streaming type code
hiha3456 a13b5e6
fix codecov
lixl-st 9a25055
Merge branch 'main' of https://github.com/opendilab/DI-engine
lixl-st ecd0119
fix import
lixl-st 4d54abf
add readme
lixl-st 65525c1
drop useless codes
hiha3456 a1b0908
change rolloutor
hiha3456 68191a9
move on_league_job in call
hiha3456 59f2a6e
change __call__
hiha3456 2010965
solve conflicts
lixl-st 1382b7f
solve conflicts
lixl-st b546f8a
Merge branch 'dev-distar-actor' of https://github.com/opendilab/DI-en…
lixl-st cfabb47
add league coordinator
lixl-st 0ff864d
reformatting
hiha3456 4cf60f9
Merge branch 'main' into dev-distar-actor
hiha3456 3e71184
Merge branch 'dev-distar-actor' into dev-league-lxl
hiha3456 3f0dcb6
Merge pull request #345 from lixl-st/dev-league-lxl
hiha3456 6542a30
add new Enum event names inside actor and test
hiha3456 c7fac3f
add test coordinator & actor pipeline
hiha3456 49ccd6c
modify _on_learner_model method inside league_actor to make it thread…
hiha3456 6a2b76c
change test_league_pipeline.py to multiple process
hiha3456 aeac530
fix a problem of env initialization inside the creation of collector
hiha3456 0cd0e1d
polish league coordinator
lixl-st c0d5dd0
Merge branch 'dev-distar-actor' of https://github.com/opendilab/DI-en…
lixl-st fe26bb3
change "policies" to "current_policies" to make it clear
hiha3456 bd6395b
polish league coordinator
lixl-st f0601c6
Merge pull request #346 from lixl-st/dev-league-lxl
hiha3456 465ce9b
add streaming data collection
hiha3456 7ff8a0a
Merge branch 'dev-distar-actor' of https://github.com/opendilab/DI-en…
hiha3456 a6e9343
demo(nyz): add distar model
PaParaZz1 8093b72
feature(nyz): polish encoder and value related modules (ci skip)
PaParaZz1 83abc64
add locker and model_dict
hiha3456 604ed59
change name of vars and add BattleContext
hiha3456 2ad937d
debugging
hiha3456 bfd149d
update quick colab link
lixl-st b13d49e
Merge branch 'main' into main
lixl-st 1071dd1
drop commit "add BattleContext, policy_getter, policy_updater, modify…
hiha3456 310c279
change the logic of update model
hiha3456 d8a385d
add actor._get_current_policies and collector._update_policies
hiha3456 0795098
change variable names
hiha3456 8d295c0
Merge branch 'dev-league-lxl' of github.com:lixl-st/DI-engine into de…
lixl-st c4397b2
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
lixl-st d0df49e
merge from dev-distar-learn
lixl-st ab7b556
add league policy
lixl-st 1137a17
reformat
lixl-st 400fa60
Merge pull request #349 from opendilab/main
hiha3456 e9d3b47
polish(nyz): polish and test distar head
PaParaZz1 fd71c46
add learner to test pipeline
lixl-st 77dc67e
change vars names, get rid of cache_pool
hiha3456 f95113a
change the position of traj_buffers initialization
hiha3456 7d5c112
change a bit rolloutor
hiha3456 bc8c332
change the style to 1.0 middleware
hiha3456 1078673
change a bit
hiha3456 d3fcc51
change a bit
hiha3456 0dc1c5f
change variable name
hiha3456 5944584
add step collector and step actor
hiha3456 c171d59
feature&optim(zzh): add DDPPO & add model-based SAC with lambda-retur…
ZHZisZZ 2859423
add league learner
lixl-st 2aa05da
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
lixl-st 2c46dbb
solve conflicts
lixl-st db3afc2
change the distar_env to fit BaseEnvManager, write test_distar_env_wi…
hiha3456 3dc0b12
arrage tests
hiha3456 9b4deaa
format context and league_actor
hiha3456 c65fecf
make old test runnable
hiha3456 cd18955
change the distar_env to fit BaseEnvManager, write test_distar_env_wi…
hiha3456 6a3d433
arrage tests
hiha3456 e3c95ff
make old test runnable
hiha3456 34e6374
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
lixl-st 8027760
change random_action to a classmethod
hiha3456 cbe964c
change a bit DI-star env
hiha3456 fa07d12
add mock policy
lixl-st 7ad26ef
reformat
lixl-st 56ed8e1
fix a bug
hiha3456 b934260
modification to run DI-star in pipeline
hiha3456 400a19d
Merge pull request #350 from lixl-st/dev-league-lxl
hiha3456 e13a5f2
merge dev-distar branch and fix conflicts
hiha3456 3518a03
fix conflicts
hiha3456 9ae914a
change mocks
hiha3456 2d050d3
one_process test pass, multiple_process test failed because cannot pi…
hiha3456 82af279
change config
hiha3456 d7faf35
Merge pull request #361 from opendilab/dev-distar-collector
hiha3456 1e78af8
transform transitions so they could be sent, and write responding tes…
hiha3456 ece3a64
feature(nyz): add distar policy learn part
PaParaZz1 e9e8ebb
change format
hiha3456 f9f6ef6
Merge pull request #365 from opendilab/dev-distar-collector
hiha3456 22c8bcd
Merge branch 'dev-distar-learn' of github.com:opendilab/DI-engine int…
lixl-st d739635
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
lixl-st ffe8f7c
change print to show node_id; change format
hiha3456 b6c24b7
handle exception during reset SC2 env
hiha3456 78bed49
rolloutor handle error during step
hiha3456 2ad4850
add test for step exception
hiha3456 5685f3b
Merge pull request #372 from opendilab/dev-distar-collector
hiha3456 72dfa38
fix a bug of TransitionList, and simpify BattleContext
hiha3456 f6ec222
change a bit
hiha3456 b6129ed
change var agent_num
hiha3456 c16e9e7
make n_episode more clear
hiha3456 6fa1209
get rid of ctx.job
hiha3456 09b7145
polish(nyz): remove whole_cfg in distar model and fix bugs
PaParaZz1 06fee68
Merge pull request #375 from opendilab/dev-distar-collector
hiha3456 6d653a1
add timer
hiha3456 dc718dc
Merge pull request #376 from opendilab/dev-distar-collector
hiha3456 5d422d6
add log
hiha3456 c0d5a52
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
lixl-st ea3269d
Merge pull request #377 from opendilab/dev-distar-collector
hiha3456 1746905
add walltime data, change ActorData Structure
hiha3456 de5f8bb
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
lixl-st a30e23b
battle_transition_list, need to change list to deque
hiha3456 d7a2563
test(nyz): add distar policy learn unittest
PaParaZz1 81502a0
merge
lixl-st b4dcce0
merge
lixl-st 8f93810
merge main into dev-distar
hiha3456 6d89d09
Merge branch 'dev-distar' of github.com:opendilab/DI-engine into dev-…
hiha3456 96debf7
Merge branch 'dev-distar' into dev-distar-collector
hiha3456 ca0987b
change a bit
hiha3456 006132f
change a bit
hiha3456 c664f60
add comment to BattleTransitionList
hiha3456 78bfd97
change a bit
hiha3456 b64a3ec
add BattleTransitionList into league_actors, but have some unexpected…
hiha3456 8d30256
add league learner exchanger
lixl-st 0bd6e4a
deal with step error
hiha3456 338a2a2
change a bit env_supervisor so it could run DI-star env
hiha3456 65e3fb1
polish pipeline & add distar example
lixl-st a9f1fc0
Merge branch 'dev-distar' into dev-league-lxl
lixl-st 805d219
make pipelines run in supervisor
hiha3456 917a6e7
Merge pull request #388 from lixl-st/dev-league-lxl
hiha3456 716c55b
fix conflicts
hiha3456 87ac3ce
reformat
hiha3456 53bfdf5
feature(nyz): add basic distar policy collect(ci skip)
PaParaZz1 d467140
change init to make test_pipeline runnable
hiha3456 7ea23f6
adjust league_learner but cannot run
hiha3456 20d4c2b
add notes in conference
hiha3456 984735f
Merge pull request #392 from opendilab/dev-distar-collector
hiha3456 2e7fb1a
merge
lixl-st e50fc12
Merge branch 'dev-distar' into dev-distar-learn
hiha3456 31cd26a
Merge pull request #393 from opendilab/dev-distar-learn
hiha3456 eafdf9c
change commit position
hiha3456 c989cc0
Merge branch 'dev-distar' into dev-distar-collector
hiha3456 a061243
add z infos
hiha3456 f44f54d
adjust codes to run the pipeline
hiha3456 763b627
Merge pull request #396 from opendilab/dev-distar-fix-bug
hiha3456 2074302
merge
lixl-st 9c0d6f3
polish(nyz): polish parse_new_game and add transform_obs
PaParaZz1 37156d4
drop get config
hiha3456 c923c23
drop useless remain_episode and ready_env_ids
hiha3456 5db2b52
feature(zms): remove the episodes shorter than unroll_len
hiha3456 63ac2df
get game_info, map_name, map_size inside DIStarEnv.reset()
hiha3456 c1cc5af
final_eval_reward
hiha3456 6aefbe2
change a bit
hiha3456 b0a1051
add logging
lixl-st 4ef8ea8
rm exp files
lixl-st 7d3a152
add info["result"]
hiha3456 c7e0b4c
polish
lixl-st 7137660
Merge branch 'dev-distar' into dev-league-lxl
hiha3456 61c3b86
Merge pull request #398 from lixl-st/dev-league-lxl
hiha3456 723cbcc
remove rep
hiha3456 0bc4b20
add comment
hiha3456 02b9d0b
add result info in job.info
hiha3456 b72cb78
comment Episode Actor and Episode Collector
hiha3456 e12a82e
move battle_inferencer_for_distar, battle_rolloutor_for_distar to fun…
hiha3456 cb97628
remove rep
hiha3456 19aae09
change to fix bug on k8s
hiha3456 e9aad7e
change a bit data_processor.py
hiha3456 98effdc
make old tests could run
hiha3456 2d71278
feature(zms): check for 60s if get new model or not
hiha3456 94d750d
Merge pull request #402 from opendilab/dev-distar-collector
hiha3456 61883a7
change commit to run in k8s
hiha3456 def4799
Merge branch 'dev-distar-collector' into dev-distar
hiha3456 05cc0f0
fix(zms): fix the bug that when job begin, there is a infinite loop
hiha3456 8f716f1
Merge branch 'dev-distar-collector' into dev-distar
hiha3456 d6ef348
update train iter
hiha3456 0cb987f
change logic of update train_iter
hiha3456 7012966
add check of main player
hiha3456 7912613
fix bug
hiha3456 2b51afc
fix bug
hiha3456 c912e4b
change structure of map_size from list to point
hiha3456 83d93d0
test(nyz): add naive distar policy collect test
PaParaZz1 d96f953
Merge branch 'dev-distar' into dev-distar-collector
hiha3456 8408399
merge dev-distar-nyz
hiha3456 7b5b233
to run in k8s
hiha3456 0b985d2
change num workers
hiha3456 57d3a43
to run real policy forward_collect
hiha3456 584fd82
print exception
hiha3456 df21a9a
reformat test
hiha3456 72803ef
fix bug
hiha3456 e8e7551
tools to do serialization and test if two objects same
hiha3456 2da4010
changes in the model to correctly make actions using pretrained model
hiha3456 a0efe1c
changes to run the test using pretrained model
hiha3456 a3c9e39
tests to test the performance againist bot using pretrained mdoel
hiha3456 43f86a1
changes in the policy(agent) to correctly make actions using pretrain…
hiha3456 a99c8bb
move GLU and build_activation in action_type_head.py to ding/torch_ut…
hiha3456 a11e146
change default value of build_activation to False
hiha3456 ca651b5
add util to change ia's model
hiha3456 8dd3b39
add update_fake_reward; change behaviour to behavior
hiha3456 70c1a7f
Merge pull request #411 from opendilab/dev-distar-collector-merge-policy
hiha3456 b2cfa91
add processss_transition
hiha3456 25a8a54
load state_dict of teacher model and other debugs
hiha3456 6c75891
not delete last episode when before append, the first step of newest …
hiha3456 1f2bc17
insert process transition into rolloutor, fix bug; move self._observa…
hiha3456 8dfa961
fix bug of calling hamming_distance
hiha3456 4a01f3d
fix bug when calling levenshtein_distance
hiha3456 b06d6d8
fix bug of dimension selected_units
hiha3456 0aedb77
changes to run the whole pipeline from sl_model
hiha3456 1dfda42
changes to run the winrate test
hiha3456 3d19436
changes to make whole pipeline running bug freely
hiha3456 73c0367
Merge branch 'dev-distar-fix-bug' into dev-distar-collector-merge-policy
hiha3456 4fc95b1
Merge pull request #420 from opendilab/dev-distar-collector-merge-policy
hiha3456 206b293
feature(zms):updates for distar
hiha3456 146643e
merge branch 'main' into dev-distar-merge-into-main
hiha3456 ea11bc1
change comments and delete useless code
hiha3456 2e60364
move out the distar files into DI-star
2dc4ab5
move out tensor_dict_to_shm
2c0e527
add comment of CpuUnpickler
7a3bc0b
move out distar_test_pipelines
3f21d3c
change import of collector.py
37b6bc7
drop out useless mocks
de8b6eb
make test of coordinator pass
e63d710
feature(zms): add test_league_learner_communicator.py
30e6249
change a bit
9dc8770
change file name
117ae79
update test_handle_step_exception.py
f66248d
update test of BattleTransitionList, and add last_step_fn in BattleTr…
8cf4a82
uupdate tests; actor, collector, functional collector
9af8168
remove one todo
6221024
reformat
16281d9
reformat
fb83c8e
fix bug
bc7b477
reformat; add last_step_fn in entry of LeagueActor and BattleStepColl…
7fc8354
update tests
3e01213
delete useless files
0a8be2e
add unittest of flatten and detach_grad
hiha3456 b3d39f9
add comments about the difference between GLU2 and GLU
hiha3456 4f09857
add unittest of parameter "dim" in default_collate; remove dim from c…
hiha3456 54ea00e
reformat
hiha3456 ef64a7e
usee pytest of test_sparse_logging
hiha3456 543f1ec
change format of comment of sparse_logging
hiha3456
@@ -1,5 +1,6 @@
-from .context import Context, OnlineRLContext, OfflineRLContext
+from .context import Context, OnlineRLContext, OfflineRLContext, BattleContext
 from .task import Task, task
 from .parallel import Parallel
 from .event_loop import EventLoop
+from .event_enum import EventEnum
 from .supervisor import Supervisor
@@ -75,3 +75,38 @@ def __init__(self, *args, **kwargs) -> None:
         self.last_eval_iter = -1
 
         self.keep('train_iter', 'last_eval_iter')
+
+
+class BattleContext(Context):
Review comment: add overview comments
+
+    def __init__(self, *args, **kwargs) -> None:
+        super().__init__(*args, **kwargs)
+        self.__dict__ = self
+        # collect target paras
+        self.n_episode = None
+
+        #collect process paras
+        self.env_episode = 0
+        self.env_step = 0
+        self.total_envstep_count = 0
+        self.train_iter = 0
+        self.collect_kwargs = {}
+        self.current_policies = []
+
+        #job paras
+        self.player_id_list = []
+        self.job_finish = False
+
+        #data
+        self.obs = None
+        self.actions = None
+        self.inference_output = {}
+        self.trajectories = None
+
+        #Return data paras
+        self.episodes = []
+        self.episode_info = []
+        self.trajectories_list = []
+        self.train_data = None
+
+        self.keep('train_iter')
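To make the role of `self.__dict__ = self` and `self.keep('train_iter')` concrete, here is a minimal sketch. It assumes that the base `Context` is a `dict` subclass and that `keep` registers keys which are copied into the next context; `MiniContext`, `MiniBattleContext`, and `renew` are hypothetical stand-ins written for illustration, not the DI-engine implementation.

```python
class MiniContext(dict):
    """Hypothetical stand-in for the Context base class (not DI-engine's code)."""

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        # Alias the attribute namespace to the dict itself,
        # so ctx.x and ctx["x"] always refer to the same value.
        self.__dict__ = self
        self._kept_keys = set()

    def keep(self, *keys: str) -> None:
        # Register keys that should survive into the next context.
        self._kept_keys.update(keys)

    def renew(self) -> "MiniContext":
        # Build a fresh context with default fields, then copy the kept keys over.
        fresh = type(self)()
        for key in self._kept_keys:
            if key in self:
                fresh[key] = self[key]
        return fresh


class MiniBattleContext(MiniContext):
    """Hypothetical, trimmed-down analogue of BattleContext."""

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.env_step = 0
        self.train_iter = 0
        self.keep('train_iter')


ctx = MiniBattleContext()
ctx.env_step = 128
ctx.train_iter = 7
next_ctx = ctx.renew()
```

Under these assumptions only `train_iter` carries over to `next_ctx`; per-iteration fields such as `env_step` fall back to their defaults.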
@@ -0,0 +1,16 @@
+from enum import Enum, unique
+
+
+@unique
+class EventEnum(str, Enum):
+    # events emited by coordinators
+    COORDINATOR_DISPATCH_ACTOR_JOB = "on_coordinator_dispatch_actor_job_{actor_id}"
+
+    # events emited by learners
+    LEARNER_SEND_MODEL = "on_learner_send_model"
+    LEARNER_SEND_META = "on_learner_send_meta"
+
+    # events emited by actors
+    ACTOR_GREETING = "on_actor_greeting"
+    ACTOR_SEND_DATA = "on_actor_send_meta_player_{player}"
+    ACTOR_FINISH_JOB = "on_actor_finish_job"
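Some of the event names in the diff above contain placeholders such as `{actor_id}` and `{player}`, which suggests they are meant to be filled in per actor or per player before use. The sketch below shows one way that could look; the `event_name` helper is hypothetical and not part of the PR, and only three of the enum members are repeated here.

```python
from enum import Enum, unique


@unique
class EventEnum(str, Enum):
    # Subset of the members from the diff, copied verbatim.
    COORDINATOR_DISPATCH_ACTOR_JOB = "on_coordinator_dispatch_actor_job_{actor_id}"
    ACTOR_GREETING = "on_actor_greeting"
    ACTOR_SEND_DATA = "on_actor_send_meta_player_{player}"


def event_name(event: EventEnum, **fields) -> str:
    """Hypothetical helper: fill the placeholders of a templated event name."""
    return event.value.format(**fields)


dispatch_event = event_name(EventEnum.COORDINATOR_DISPATCH_ACTOR_JOB, actor_id=3)
data_event = event_name(EventEnum.ACTOR_SEND_DATA, player="main_player_0")
greeting_event = event_name(EventEnum.ACTOR_GREETING)  # no placeholders, returned unchanged
```

Going through `.value` keeps the formatting independent of how a given Python version renders mixed-in enum members as strings.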
@@ -1,4 +1,7 @@
 from .functional import *
-from .collector import StepCollector, EpisodeCollector
+from .collector import StepCollector, EpisodeCollector, BattleStepCollector
 from .learner import OffPolicyLearner, HERLearner
 from .ckpt_handler import CkptSaver
+from .league_actor import StepLeagueActor
+from .league_coordinator import LeagueCoordinator
+from .league_learner_communicator import LeagueLearnerCommunicator, LearnerModel
@@ -0,0 +1,30 @@
+from typing import Any, List
+from dataclasses import dataclass, field
+
+#TODO(zms): simplify fields
+
+
+@dataclass
+class ActorDataMeta:
+    player_total_env_step: int = 0
+    actor_id: int = 0
+    send_wall_time: float = 0.0
+
+
+@dataclass
+class ActorEnvTrajectories:
+    env_id: int = 0
+    trajectories: List = field(default_factory=[])
+
+
+@dataclass
+class ActorData:
+    meta: ActorDataMeta
+    train_data: List[ActorEnvTrajectories] = field(default_factory=[])
+
+
+@dataclass
+class PlayerModelInfo:
+    get_new_model_time: float
Review comment: why are there two time fields here? Maybe we can simplify them.
+    update_new_model_time: float
+    update_train_iter: int = 0
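A note on `field(default_factory=[])` in the diff above: `dataclasses.field` expects `default_factory` to be a zero-argument callable, so passing a list instance should fail with a `TypeError` as soon as the class is instantiated without that argument. The sketch below contrasts the two spellings; `BrokenSketch` and `EnvTrajectoriesSketch` are hypothetical names used only for this illustration, and `default_factory=list` is the usual idiom rather than something the PR itself contains.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BrokenSketch:
    # Mirrors the diff: a list instance is passed instead of a callable.
    trajectories: List = field(default_factory=[])


try:
    BrokenSketch()
    broken_raises = False
except TypeError:
    # The generated __init__ tries to call the list object.
    broken_raises = True


@dataclass
class EnvTrajectoriesSketch:
    env_id: int = 0
    # `list` is called once per instance, so each object gets its own list.
    trajectories: List = field(default_factory=list)


first = EnvTrajectoriesSketch(env_id=0)
second = EnvTrajectoriesSketch(env_id=1)
first.trajectories.append("step-0")  # does not affect `second`
```

With `default_factory=list`, instances do not share a mutable default, which is presumably what the original fields were meant to achieve.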