@tmikus tmikus commented Jun 10, 2021

This PR also fixes a segmentation fault when calling the detect function on Macs with an M1 chip.
The pointer written by PyArg_ParseTupleAndKeywords was being set to an incorrect address, which caused the segmentation fault. Surprisingly, changing the call so that it requests a null-terminated string fixes the problem.

I tried other format units for that function (s*, y*, y#), but to no avail.
I suspect this might be a problem with Python itself (3.9.5 for Apple Silicon).

One thing to remember is that the detect function will no longer accept bytes-like objects. It will still work fine with UTF-8 strings.
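
For callers that still hold bytes, one option is to decode before calling detect. The sketch below is only an illustration and is not part of this PR: `to_text` and `detect_text` are hypothetical helper names, and `detect` is passed in as a parameter so the snippet assumes nothing about pycld2 beyond it accepting the text as its first argument.

```python
def to_text(data, encoding="utf-8", errors="strict"):
    """Return `data` as str, decoding bytes-like input (hypothetical helper)."""
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray, memoryview)):
        # bytes() copies the buffer so bytearray and memoryview decode the same way.
        return bytes(data).decode(encoding, errors)
    raise TypeError(f"expected str or bytes-like object, got {type(data).__name__}")


def detect_text(detect, data, **kwargs):
    """Call `detect` with a str regardless of the input type (hypothetical wrapper)."""
    return detect(to_text(data), **kwargs)
```

Assuming pycld2 is imported as `cld2`, this would be used as `detect_text(cld2.detect, raw_bytes)`. Input that is not valid UTF-8 raises `UnicodeDecodeError` in `to_text` rather than reaching the extension.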

@NickHilton

This works well for me on a Mac M1. Could this be merged, or released to PyPI as a separate version?

@gabrielcabola

Hey, when will this be merged?
This is still a relevant issue for M1 machines.
Thanks

@akgerber

akgerber commented Mar 29, 2022

Thanks for posting this, @tmikus. It made my day much easier.

Here's a script to apply this fix in a virtualenv located at $MYVIRTUALENV:

# Download the pycld2 0.41 source into the virtualenv
mkdir -p "$MYVIRTUALENV/src"
cd "$MYVIRTUALENV/src"
wget https://files.pythonhosted.org/packages/21/d2/8b0def84a53c88d0eb27c67b05269fbd16ad68df8c78849e7b5d65e6aec3/pycld2-0.41.tar.gz
tar -xzvf pycld2-0.41.tar.gz
cd pycld2-0.41
# Fetch and apply the patch from pull request 44
wget https://patch-diff.githubusercontent.com/raw/aboSamoor/pycld2/pull/44.patch
patch -p1 < 44.patch
# Build and install the patched package
CFLAGS="-w -O2 -fPIC -march=armv8-a" pip install --no-cache-dir --no-deps --no-build-isolation .

@cotterpl

@aboSamoor why is this still not solved?

@leegunwoo98

Still getting an error.

@jurifm2406

Got an error:

error: unknown target CPU 'armv8-a'

on Apple M1 (arm64).
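
This message can appear when the compiler is targeting x86_64 rather than arm64, for example when the Python interpreter is an x86_64 build running under Rosetta. That is a guess about this report, not something confirmed in the thread. A quick check of what the interpreter reports about itself, using only the standard library (the function names are illustrative):

```python
import platform
import sysconfig


def build_target_info():
    """Collect what this interpreter reports about its own architecture."""
    return {
        "machine": platform.machine(),             # e.g. "arm64" or "x86_64"
        "platform_tag": sysconfig.get_platform(),  # e.g. "macosx-11.0-arm64"
        "compiler": sysconfig.get_config_var("CC"),
    }


def is_arm64(machine):
    """True if the reported machine name is a 64-bit ARM variant."""
    return machine.lower() in {"arm64", "aarch64"}


if __name__ == "__main__":
    info = build_target_info()
    print(info)
    if not is_arm64(info["machine"]):
        print("Interpreter is not arm64; -march=armv8-a is unlikely to be accepted.")
```

If the interpreter does not report arm64, dropping `-march=armv8-a` from CFLAGS or switching to a native arm64 Python may be worth trying.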

@gregoriopellegrino

You can install it using:

pip install git+https://github.com/tmikus/pycld2.git

@puja93

puja93 commented Mar 13, 2024

Thanks @tmikus, your PR works fine on my M1 for UTF-8 strings but not for bytes objects, as you mentioned.
It has been some time, but I wonder whether you've found a way to support bytes objects as well?

I'm currently using polyglot, which depends on this. For now I just changed this line to make it work:
cld2.detect(t) -> cld2.detect(t.decode('utf-8'))

So I wonder if you've made any new findings on bytes objects.
