Porting Lenet to n300 #13334

Open
2 of 7 tasks
saichandax opened this issue Oct 1, 2024 · 0 comments
saichandax commented Oct 1, 2024

Executive Summary (as of Nov 5):

  • Single-device implementation is complete but BLOCKED pending approvals on #13473.

    • PCC = 0.99
    • Ops still run in torch: 2 maxpool (#13624, #12642)
    • BS = 8 (higher batch sizes still to be tried)
    • Perf numbers (with an older profiler build):
      • E2E perf is 9.49 samples/sec
      • Device perf is 988.3 samples/sec
  • Data-parallel implementation is complete but BLOCKED pending CI runs and approvals on #13679.

    • PCC = 0.99
    • In addition to the torch ops in the single-device implementation, torch reshape and permute are used to work around the low-PCC issue in #13860
    • BS = 16
    • Perf numbers (with an older profiler build):
      • E2E perf is 4.47 samples/sec
      • Device perf is 3064.8 samples/sec
  • trace_2cqs implementation is not done yet (BLOCKED by the torch ops in the model).
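
For reference, the PCC figures above are Pearson correlation coefficients between the reference (torch) output and the device output. A minimal sketch of such a check follows. It is written in plain Python so it runs without torch or ttnn installed. The function names `pcc` and `passes_pcc`, the sample values, and the use of 0.99 as a pass threshold are all assumptions made for illustration, not the repository's actual helper.

```python
import math


def pcc(expected, actual):
    """Pearson correlation coefficient of two equal-length float sequences."""
    if len(expected) != len(actual) or len(expected) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    # Covariance and variances, each without the common 1/n factor (it cancels).
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        raise ValueError("PCC is undefined for a constant input")
    return cov / math.sqrt(var_e * var_a)


def passes_pcc(expected, actual, threshold=0.99):
    """True when the correlation meets the (assumed) 0.99 threshold."""
    return pcc(expected, actual) >= threshold


# Hypothetical flattened outputs: reference model vs. device run.
reference = [0.10, -1.20, 3.40, 0.75, -0.30]
device = [0.11, -1.18, 3.35, 0.77, -0.31]
print(passes_pcc(reference, device))
```

In practice both tensors would be flattened to 1-D before the comparison.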

ToDo:

  • Single device: check higher batch sizes, report new perf numbers, then merge once approvals are in.
  • Data parallel: run the same checks (batch size, perf), then merge once approvals are in and CI passes.
  • Implement perf measurement with trace_2cqs.
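
When re-measuring at higher batch sizes, it can help to convert between throughput and per-batch time. The sketch below assumes samples/sec is simply batch size divided by the wall-clock time of one batch. That is an assumption about how the numbers above were obtained, and the helper names are hypothetical.

```python
def samples_per_sec(batch_size, batch_seconds):
    """Throughput for one batch, assuming throughput = batch_size / time."""
    if batch_seconds <= 0:
        raise ValueError("batch_seconds must be positive")
    return batch_size / batch_seconds


def seconds_per_batch(batch_size, throughput):
    """Inverse of samples_per_sec: time one batch takes at a given throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return batch_size / throughput


# Reported single-device numbers at BS = 8, converted to per-batch time.
for label, throughput in (("E2E", 9.49), ("device", 988.3)):
    print(f"{label}: {seconds_per_batch(8, throughput) * 1e3:.2f} ms per batch")
```

Comparing per-batch time across batch sizes shows whether a larger batch actually improves throughput or only adds latency.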