Open
Labels
bug: Something isn't working
Description
As a reproducer, in a conda-forge controlled environment:
mkdir blah; cd blah
# the best recipe..
wget https://atomistic-cookbook.org/_downloads/72b9bec1c6e219fe3a0fb83fa52b668b/eon-pet-neb.zip
unzip eon-pet-neb.zip
conda env create --file environment.yml -p $(pwd)/.tmp
conda activate $(pwd)/.tmp
So far so good. Also works with the PET-MAD stuff on upet, as seen in lab-cosmo/atomistic-cookbook#212.
However, the OMAD models fail terribly. Generate the inputs..
python eon-pet-neb.py # takes a minute or less
# use a newer metatrain
uvx --from metatrain mtt export https://huggingface.co/lab-cosmo/upet/resolve/main/models/pet-omad-xs-v1.0.0.ckpt
Make the change in config.ini, i.e.
[Metatomic]
model_path = pet-omad-xs-v1.0.0.pt
Now a fresh run..
rm -rf neb* *.log
eonclient
Floating point exception: Overflow
Aborted (core dumped)
Which can be expanded to:
GDB trace
[New Thread 0x7fffa2ffb240 (LWP 778554)]
Thread 1 "eonclient" received signal SIGFPE, Arithmetic exception.
0x00007fffdca9f3a5 in Sleef_finz_expf8_u10avx2 () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/./libsleef.so.3
(gdb) bt
#0 0x00007fffdca9f3a5 in Sleef_finz_expf8_u10avx2 () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/./libsleef.so.3
#1 0x00007fffebe5062d in void c10::function_ref<void (char**, long const*, long, long)>::callback_fn<at::native::AVX2::VectorizedLoop2d<at::native::(anonymous namespace)::silu_kernel(at::TensorIteratorBase&)::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda(float)#1}, at::native::(anonymous namespace)::silu_kernel(at::TensorIteratorBase&)::{lambda()#2}::operator()() const::{lambda()#2}::operator()() const::{lambda(at::vec::AVX2::Vectorized<float>)#1}> >(long, char**, long const*, long, long) () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#2 0x00007fffe571d369 in at::TensorIteratorBase::serial_for_each(c10::function_ref<void (char**, long const*, long, long)>, at::Range) const ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#3 0x00007fffe571da80 in at::TensorIteratorBase::for_each(c10::function_ref<void (char**, long const*, long, long)>, long) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#4 0x00007fffebe954ce in at::native::(anonymous namespace)::silu_kernel(at::TensorIteratorBase&) () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#5 0x00007fffe6ce9789 in at::(anonymous namespace)::wrapper_CPU_silu(at::Tensor const&) () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#6 0x00007fffe6ce995e in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&), &at::(anonymous namespace)::wrapper_CPU_silu>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&> >, at::Tensor (at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#7 0x00007fffe6997b86 in at::_ops::silu::redispatch(c10::DispatchKeySet, at::Tensor const&) () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#8 0x00007fffe94a7233 in torch::autograd::VariableType::(anonymous namespace)::silu(c10::DispatchKeySet, at::Tensor const&) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#9 0x00007fffe94a781d in c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::silu>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#10 0x00007fffea6d9566 in c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const [clone .isra.0] ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#11 0x00007fffea241d28 in bool torch::jit::InterpreterStateImpl::runTemplate<false>(std::vector<c10::IValue, std::allocator<c10::IValue> >&) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#12 0x00007fffea247ce5 in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#13 0x00007fffea222e76 in torch::jit::GraphExecutorImplBase::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#14 0x00007fffe9e96433 in torch::jit::Method::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) const () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libtorch_cpu.so
#15 0x00007ffff67edf8d in MetatomicPotential::force(long, double const*, int const*, double*, double*, double*, double const*) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libmetatomic_pot.so
#16 0x00007ffff7f6f103 in Potential::get_ef(Eigen::Matrix<double, -1, 3, 1, -1, 3>, Eigen::Matrix<int, -1, 1, 0, -1, 1>, Eigen::Matrix<double, 3, 3, 1, 3, 3>) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#17 0x00007ffff7ee15f8 in Matter::computePotential() () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#18 0x00007ffff7ee28da in Matter::getPotentialEnergy() () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#19 0x00007ffff7f1c31a in NudgedElasticBand::NudgedElasticBand(std::vector<Matter, std::allocator<Matter> >, std::shared_ptr<Parameters>, std::shared_ptr<Potential>) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#20 0x00007ffff7f1ca09 in NudgedElasticBand::NudgedElasticBand(std::shared_ptr<Matter>, std::shared_ptr<Matter>, std::shared_ptr<Parameters>, std::shared_ptr<Potential>) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#21 0x00007ffff7f32e99 in std::__detail::_MakeUniq<NudgedElasticBand>::__single_object std::make_unique<NudgedElasticBand, std::shared_ptr<Matter>&, std::shared_ptr<Matter>&, std::shared_ptr<Parameters>&, std::shared_ptr<Potential>&>(std::shared_ptr<Matter>&, std::shared_ptr<Matter>&, std::shared_ptr<Parameters>&, std::shared_ptr<Potential>&) ()
from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#22 0x00007ffff7f362a6 in NudgedElasticBandJob::run[abi:cxx11]() () from /home/goswami/Git/Github/epfl/lab-cosmo/pixi_envs/atomistic-cookbook/atomistic-cookbook/.nox/eon-pet-neb/bin/../lib/libeonclib.so
#23 0x0000555555564409 in main ()
However, @Luthaf pointed out that the sleef symbols are only linked in the conda variant, and indeed, a separate environment where everything is managed with pip and eonclient is installed from source works just fine.
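To compare the two environments, it may help to check which shared libraries are actually mapped into a process once torch is loaded. The following is a minimal, Linux-only sketch: `loaded_libraries` and `sleef_in_this_process` are hypothetical helper names, and the approach simply scans `/proc/self/maps`, so a statically linked copy of sleef would not show up here.

```python
from pathlib import Path


def loaded_libraries(maps_text: str, needle: str) -> list[str]:
    """Return the distinct mapped file paths whose file name contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(maxsplit=5)
        # Only file-backed (or named) mappings carry a sixth field.
        if len(fields) == 6 and needle in Path(fields[5]).name:
            paths.add(fields[5])
    return sorted(paths)


def sleef_in_this_process() -> list[str]:
    """Linux only: list libsleef mappings of the current process."""
    return loaded_libraries(Path("/proc/self/maps").read_text(), "sleef")
```

Running `import torch` followed by `print(sleef_in_this_process())` inside each environment should show whether a separate `libsleef.so` is loaded, as it is in the trace above for the conda variant.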
SPICE and PET-MAD models work fine though. Table incoming.
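One possible reading of the trace, not confirmed in this issue: frame #0 is a vectorized single-precision `exp` called from `silu_kernel`. Assuming the kernel evaluates SiLU as `x / (1 + exp(-x))` in float32, and assuming eonclient traps on floating-point exceptions, a sufficiently negative activation would overflow the intermediate `exp(-x)` even though the final SiLU value is finite. The sketch below only illustrates that arithmetic with the standard library; the function names are hypothetical and it does not call torch or sleef.

```python
import math

# Largest finite float32 value, about 3.4028e38.
FLOAT32_MAX = (2.0 - 2.0**-23) * 2.0**127


def exp_overflows_float32(x: float) -> bool:
    """True if exp(x) exceeds the largest finite float32."""
    try:
        return math.exp(x) > FLOAT32_MAX
    except OverflowError:
        return True  # too large even for float64


def silu(x: float) -> float:
    """SiLU written as x / (1 + exp(-x)), evaluated in float64."""
    return x / (1.0 + math.exp(-x))


x = -100.0
# The intermediate exp(-x) = exp(100) does not fit in float32 ...
print(exp_overflows_float32(-x))
# ... while the SiLU value itself is a tiny, finite number.
print(math.isfinite(silu(x)))
```

The float32 threshold sits near `x = 88.72`, so under these assumptions any input to SiLU below roughly `-88.72` would raise the overflow flag. Whether the OMAD model actually produces such activations here is not established.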