Aborted (core dumped) in flow.nn.MovingAverageMinMaxObserver #10586

Open
x0w3n opened this issue Dec 5, 2024 · 0 comments
Labels
bug community events from community

Summary

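Running flow.nn.MovingAverageMinMaxObserver with a quantization_formula other than "google" or "cambricon" aborts the whole Python process (core dumped) with an UNIMPLEMENTED check failure in the CPU kernel.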

Code to reproduce bug

import oneflow as flow
import numpy as np

weight = (np.random.random((2, 3, 4, 5)) - 0.5).astype(np.float32)
input_tensor = flow.tensor(weight, dtype=flow.float32)
current_train_step_tensor = flow.tensor(np.zeros((1,), dtype=np.float32), dtype=flow.int64)
stop_update_after_iters = 0
quantization_formula = "hack"  # any value other than "google" or "cambricon" triggers the abort

moving_average_min_max_observer = flow.nn.MovingAverageMinMaxObserver(
    stop_update_after_iters=stop_update_after_iters,
    quantization_formula=quantization_formula,
    momentum=0.95,
    quantization_bit=8,
    quantization_scheme="symmetric",
)
(scale, zero_point) = moving_average_min_max_observer(input_tensor, current_train_step_tensor)

print("Scale:", scale.numpy())
print("Zero Point:", zero_point.numpy())

Output:

F20241205 09:19:02.872363 2444755 moving_average_min_max_observer_kernel.cpp:156] UNIMPLEMENTED
*** Check failure stack trace: ***
    @     0x7f53f47d09ca  google::LogMessage::Fail()
    @     0x7f53f47d0cb2  google::LogMessage::SendToLog()
    @     0x7f53f47d0537  google::LogMessage::Flush()
    @     0x7f53f47d30a9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f53efcfe4d1  oneflow::CpuMovingAverageMinMaxObserverKernel<>::Compute()
    @     0x7f53f054e54d  oneflow::one::StatefulOpKernel::Compute()
    @     0x7f53ee7e8cab  oneflow::vm::OpCallInstructionUtil::Compute()
    @     0x7f53ee7e6787  oneflow::vm::OpCallInstructionPolicy::Compute()
    @     0x7f53ee7e25bc  oneflow::vm::Instruction::Compute()
    @     0x7f53ee7e0a6f  oneflow::vm::EpStreamPolicyBase::Run()
    @     0x7f53ee7ec086  oneflow::vm::StreamPolicy::RunIf()
    @     0x7f53ee7f36de  oneflow::vm::ThreadCtx::TryReceiveAndRun()
    @     0x7f53ee7f5d2d  oneflow::(anonymous namespace)::WorkerLoop()
    @     0x7f53ee7f611f  _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJPFvPN7oneflow2vm9ThreadCtxERKSt8functionIFvS6_EEES6_ZNS3_14VirtualMachine15CreateThreadCtxENS3_6SymbolINS3_6DeviceEEENS3_10StreamTypeEmEUlS6_E3_EEEEE6_M_runEv
    @     0x7f53f47e540f  execute_native_thread_routine
    @     0x7f54dc321b43  (unknown)
    @     0x7f54dc3b3a00  (unknown)
Aborted (core dumped)
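
Possible user-side workaround

Until the kernel rejects unsupported values gracefully, the argument can be checked in Python before the observer is built. This is only a sketch: check_quantization_formula and SUPPORTED_FORMULAS are hypothetical names, not part of OneFlow, and the allowed values are assumed from the two formulas named in the summary above.

```python
# Hypothetical helper, not part of OneFlow.
# Assumption: only the two formulas named in the summary are supported.
SUPPORTED_FORMULAS = ("google", "cambricon")


def check_quantization_formula(formula):
    """Return formula unchanged if supported, otherwise raise ValueError."""
    if formula not in SUPPORTED_FORMULAS:
        raise ValueError(
            "unsupported quantization_formula %r; expected one of %s"
            % (formula, ", ".join(SUPPORTED_FORMULAS))
        )
    return formula
```

Passing quantization_formula=check_quantization_formula(quantization_formula) to flow.nn.MovingAverageMinMaxObserver in the script above would turn the abort into a catchable ValueError for "hack".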

System Information

  • What is your OneFlow installation (pip, source, dockerhub): pip
  • OS: Ubuntu 22.04.3 LTS
  • OneFlow version (run python3 -m oneflow --doctor):
path: ['/home/miniconda3/envs/oneflow/lib/python3.9/site-packages/oneflow']
version: 0.9.0
git_commit: 381b12c
cmake_build_type: Release
rdma: True
mlir: True
  • Python version: 3.9.13
  • CUDA driver version: 12.2
  • GPU models: NVIDIA GeForce RTX 4090
  • Other info: None