
Comments (28)

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Hi Batte, yes, the problem is that it's converting everything to ANSI on Windows, which is not the best solution. I'll change this to convert everything to UTF-8, but you won't see the ♫ unless you change the command line's default code page, since it uses ANSI by default. You can change the code page to UTF-8 with "chcp 65001", but you will also need to change the default font to one that supports these characters ( like Lucida Console ).
You can also compile with UNICODE support: if you're using Visual Studio, go to the project properties -> Configuration Properties -> General -> Character Set -> Use Unicode Character Set. But since I'll make this change so that it always works with UTF-8, the result will be exactly the same.
Thanks for reporting it!

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Fixed in #a4bca78

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


Hello,

I understand, but the default encoding of the command-line pane in QtCreator (the IDE I use) does not seem to be ANSI, which is why I'm able to see special chars like ♫.

About your new encoding choice (UTF-8), are you sure that's the correct encoding for Windows? I thought it was UTF-16 ... Maybe you're right, I'm actually not sure!

Thanks,

B.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Yes, Windows encourages the use of UTF-16 as the default encoding, but it's not a requirement, since it supports any Unicode encoding. I can't use UTF-16 because I'm using std::string to keep it simple, and the other OSes use UTF-8, so the correct approach is to always use the same encoding. There's nothing stopping you from converting the strings to any other encoding, and you can use the String class used internally by efsw ( efsw::String::fromUtf8( filename ).toWideString() ).
I also use QtCreator. The application output on other OSes is set to UTF-8 and looks fine, but I don't know what it uses on Windows. I tried printing UTF-16 and it doesn't seem to work either. I don't have time to continue testing, but it's not something that I care much about; it's not a problem of efsw.
Sadly I can't satisfy every developer, but if you want to suggest another solution, I'm listening!
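
For readers who want to convert the UTF-8 strings to UTF-16 without going through efsw::String, here is a minimal standard-library sketch. The helper name utf8ToUtf16 is hypothetical and is not part of efsw. Note that std::wstring_convert is deprecated since C++17, so take this as an illustration under those assumptions rather than a recommended implementation.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper (not part of efsw): convert a UTF-8 encoded
// std::string into UTF-16 code units.
// from_bytes() throws std::range_error if the input is not valid UTF-8.
std::u16string utf8ToUtf16(const std::string& utf8)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.from_bytes(utf8);
}
```

With this sketch, the three UTF-8 bytes of "♫" ( e2 99 ab ) become the single UTF-16 code unit 0x266B.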

Edit:
Hint: Read this https://bugreports.qt-project.org/browse/QTCREATORBUG-316
Try with calling this: http://msdn.microsoft.com/en-us/library/windows/desktop/ms686036(v=vs.85).aspx with the correct codepage ( 65001 ).
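
A minimal sketch of that call, assuming the linked API is SetConsoleOutputCP. The helper name enableUtf8ConsoleOutput is hypothetical, and on non-Windows systems it does nothing. The console font still has to support the characters.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of efsw): switch the console output code
// page to UTF-8 so that UTF-8 bytes written to stdout are shown correctly.
// Returns true on success; on non-Windows systems there is nothing to do.
bool enableUtf8ConsoleOutput()
{
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0; // CP_UTF8 is code page 65001
#else
    return true;
#endif
}
```

Call it once at the start of main(), before printing any file names.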

Regards,
Martín

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


Hello,

All UTF-8 chars seem to be correctly sent by EFSW.

But I have some problems with other files (files whose names originate from Mac OS X: created directly on a Mac and displayed correctly by Finder and Windows Explorer).

One of them :
https://mega.co.nz/#!FR1UQBDa!TzRLQ210dwpE2KQGUNz__1fWcOsvo19VFLed-2YujLE

On Windows, if you edit its filename, copy the filename, and paste it into a plain text editor like Notepad, you'll see some unusual chars! And these chars are not correctly sent by EFSW ... Do you know why?

I hope I'm being clear ...

Do not hesitate to tell me if I'm not.

B.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Sorry, but I'm not sure what you are trying to say.
Maybe you can explain step by step how to reproduce the problem, and try to be a little clearer about what the problem is. From what I understand, it doesn't sound like an efsw bug.
Thanks,
Martín

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


OK, sorry for not being clear.

It's simple. As with "test♫.xlsx", I tried to put the file below into the watched folder:
https://mega.co.nz/#!FR1UQBDa!TzRLQ210dwpE2KQGUNz__1fWcOsvo19VFLed-2YujLE

Result? As with the "test?.xlsx" that came from EFSW, I received a wrong filename for this new file.

Do you know why ?

For information: after some investigation, I understood that the file came from Mac OS X and had some strange chars in its filename (you can see them by copying the filename into Notepad or another plain text editor).

B.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


This is what I explained in the previous messages: this is not an efsw problem. You need to use a command line that supports UTF-8 character encoding, with a font that also supports it. Or you can change the output to an encoding that the command line interprets correctly.
For example, I used the Cygwin console to show you that this is already working ( since it uses UTF-8 by default ): example working.
There are some other options, just search something like "windows unicode command line support" in Google.
Regards,
Martín

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


Hello,

I also have a char encoding problem on Mac OS. Indeed, if I put a folder "lolélalé" into a watched folder, I get "lole'lale'" ...

To help, here is part of my code:

case efsw::Actions::Add:
    std::cout << "DIR (" << dir << ") FILE (" << filename << ") has event Added" << std::endl;
    _filewatchersignals->emit_addSignal(QString::fromUtf8(dir.c_str()), QString::fromUtf8(filename.c_str()));
    break;
case efsw::Actions::Delete:
    std::cout << "DIR (" << dir << ") FILE (" << filename << ") has event Delete" << std::endl;
    _filewatchersignals->emit_deleteSignal(QString::fromUtf8(dir.c_str()), QString::fromUtf8(filename.c_str()));
    break;
case efsw::Actions::Modified:
    std::cout << "DIR (" << dir << ") FILE (" << filename << ") has event Modified" << std::endl;
    _filewatchersignals->emit_modifiedSignal(QString::fromUtf8(dir.c_str()), QString::fromUtf8(filename.c_str()));
    break;
case efsw::Actions::Moved:
    std::cout << "DIR (" << dir << ") FILE (" << filename << ") has event Moved from (" << oldFilename << ")" << std::endl;
    _filewatchersignals->emit_movedSignal(QString::fromUtf8(dir.c_str()), QString::fromUtf8(filename.c_str()), QString::fromUtf8(oldFilename.c_str()));
    break;
default:
    std::cout << "Should never happen!" << std::endl;

You can see that I correctly read the output as UTF-8 ...

Thanks for your help.

B.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


You have the same problem as on Windows: your locale is not correctly set in the Terminal. I've tested with the default terminal locale ( en_US.UTF-8 ) and everything works just fine. It also works in the application output from QtCreator. Your code looks fine, so I don't think there's anything wrong there.
OS X and UTF-8 example

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


Hello,

I understand what you're saying, but I think my problem is something else.

Indeed, my Qt project is a client which talks to a web service.
On Windows, when EFSW gives a file "tété.txt" to the Qt client (which sends it), the web service correctly receives the file (with the same file name "tété.txt").
On Mac OS, when EFSW gives the same file name, the web service receives a wrong filename.

I have looked at the decimal value of each char of the file name sent by EFSW, and they don't seem to be the UTF-8 values of the "t" and "é" chars.

Do you know what I mean ?

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Yes, it's clear what you are describing. I'll compare the string hashes produced on OS X and Windows. If something is different, it means that efsw is doing something wrong; otherwise it must be something in your application.

Let me see and i'll tell you.

Thanks

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


OK, I ran the tests and everything looks fine. The string hashes are the same, and the binary data is exactly the same. I still think that this is not an efsw issue. If you can reproduce it with a simple example that I can test here, I'll take a look at it. But please, nothing with Qt or client/server, since it has nothing to do with the library.

OS X hashes
Windows 7 hashes

Regards

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


Hello,

I'm sorry, but I still have some problems with EFSW encoding.

I have printed in hexadecimal the string that EFSW gives after an event occurs.
The result is:

#!

DIR (/Users/bb/MCF/[email protected]/Privévè/Coffre-fort/aabbcc/) FILE (pépè.png(0x7065ffffffccffffff817065ffffffccffffff802e706e67)) has event Added

As you can see, all characters are encoded in Unicode UTF-8 ...

  • "p" : 0x70
  • "." : 0x2e
  • "n" : 0x6e
  • "g" : 0x67

... EXCEPT "é" and "è" :

  • "é" : 0x65ffffffccffffff81 (which seems to be the UTF-8 code of "e" (0x65) followed by something else (0xffffffccffffff81)).
  • "è" : 0x65ffffffccffffff80 (which seems to be the UTF-8 code of "e" (0x65) followed by something else (0xffffffccffffff80)).

But normally, UTF-8 code of "é" is : 0xc3a9
and UTF-8 of "è" is : 0xc3a8

This difference is the reason why my C++ program (in Qt) doesn't correctly understand the word "pépè.png" ...

Do you see the same thing?
Do you have an explanation?

Thanks for your help.

B.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Sorry, but I tested again and I'm getting the correct UTF-8 codes ( I tested with MinGW and VS too ).
I'll need a minimal test where I can reproduce your problem, if possible without Qt, since I think that's where your problem is. Have you tested this with the efsw-test that comes with the project?

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


I'm gonna test with efsw-test.

Can you just give me the hexadecimal output of a file "pépè.png" detected by EFSW ?

Something like this:

#!c++

void print_hex(const char *s)
{
    while(*s)
        printf("%02x", (unsigned int) *s++);
}

[...]

switch (action)
{
case efsw::Actions::Add:
    std::cout << "DIR (" << dir << ") FILE (" << filename << "(";
    print_hex(filename.c_str());
    break;
[...]

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Here it is

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


OK, thank you.

Do you know why "é" and "è" chars are encoded on 14 bytes instead of 2 for others ?

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


No, that's not the encoding. I printed the data as you asked, converting every char to unsigned int ( printf("%02x", (unsigned int) *s++); ), and that's why you see those extra ffffff: char is signed here, so bytes of 0x80 and above are negative and get sign-extended by the cast.
The first byte of è is c3 and the second byte is a8.
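
To avoid those extra ffffff, each byte can be cast through unsigned char before it is widened. A minimal sketch follows; the helper name toHex is hypothetical and is not part of efsw or of the test program.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: return the bytes of a string as lowercase hex.
// Going through unsigned char first prevents sign extension on platforms
// where plain char is signed, so 0xc3 prints as "c3" and not "ffffffc3".
std::string toHex(const std::string& s)
{
    std::string out;
    char buf[3]; // two hex digits plus the terminating NUL
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned int>(byte));
        out += buf;
    }
    return out;
}
```

With this version, the bytes 70 c3 a8 come out as "70c3a8".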

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


I think your problem is that you're not correctly converting the UTF-8 std::string to QString. You need to create the string using QString::fromUtf8, and I think you are using QString( str.c_str() ).

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Oh no, now I see from your previous post that you used QString::fromUtf8. So I don't know. Still, if you want, make a minimal example of this failing and I'll debug it ( use Qt4 if you want, because I think that's where the problem is ).

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


Hello,

It's really, really strange.

As you advised, I changed the "test" sources of the EFSW project:

src/test/efsw-test.cpp :

#!c++

[...]
void print_hex(const char *s)
{
    while (*s)
        printf("%02x", (unsigned int) *s++);
}

void handleFileAction( efsw::WatchID watchid, const std::string& dir, const std::string& filename, efsw::Action action, std::string oldFilename = ""  )
{
    std::cout << "DIR (" << dir + ") FILE (" + ( oldFilename.empty() ? "" : "from file " + oldFilename + " to " ) + filename + " (";
    print_hex(filename.c_str());
    std::cout << ") " << ") has event " << getActionName( action ) << std::endl;
}
[...]

As you can see, I've just added the "print_hex()" function. Qt is not involved at all; indeed, I used your makefile to compile the test program.

After compiling and executing, I get :

#!

iMac-de-B:bin bb$ ./efsw-test-release 
Press ^C to exit demo
CurPath: /Users/bb/Documents/efsw_test/efsw_project/bin/
Added WatchID: 1
Added WatchID: 2
DIR (/Users/buchet_b/Documents/efsw_test/efsw_project/bin/test/) FILE (from file pépé copie to pépé (7065ffffffccffffff817065ffffffccffffff81) ) has event Moved

So exactly the same ...

I really need your help. You'll find the EFSW project I use, here :

https://mega.co.nz/#!IQ0EDZZB!BAR8vwK8cnDWo05hpIJ_BhOkXgg0CaFNr0zsEPDMWYU

With these sources, what result do you get?

Do you have any other ideas?

Thanks a lot in advance,

B.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Wait... your project file is from OS X, and I was testing on Windows... so... your problems are now on OS X?
Give me a few minutes and I'll test on OS X ( but I tested it earlier in this same thread and it was working fine ).

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


I'm getting the correct code:

#!c++

DIR (/Users/charly/Downloads/efsw_project/bin/test/) FILE (from file pépé to pèpè (70ffffffc3ffffffa870ffffffc3ffffffa8) ) has event Moved

What I'm thinking is that your OS X file system is using a different encoding for file names.
I've read some articles about that, but I'm not sure how to handle it right now.
What I need you to do is:
run python from the terminal and enter:
import sys
print sys.getfilesystemencoding()

(If you're on Mavericks and python crashes when running this, fix it with the instructions from here: http://stackoverflow.com/questions/19569143/python3-segmentation-fault-on-osx-mavericks ).
Then tell me what you get; it must be something different from utf-8.

It must be something similar to these problems:
https://bugzilla.mozilla.org/show_bug.cgi?id=703161
http://stackoverflow.com/questions/9757843/unicode-encoding-for-filesystem-in-mac-os-x-not-correct-in-python
http://apple.stackexchange.com/questions/10476/how-to-enter-special-characters-so-that-bash-terminal-understands-them

I'm a little too busy to look for a fix right now. I'll need you to help me with this, or just wait a bit until I get some time to read about it. I don't even own a Mac, so it's not that easy for me to look into this.

Regards,
Martín

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


#!

iMac-de-B:efsw_test bb$ python
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; import os; print sys.getfilesystemencoding()
utf-8

:(

Indeed, this link seems to be interesting :

http://stackoverflow.com/questions/9757843/unicode-encoding-for-filesystem-in-mac-os-x-not-correct-in-python

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


I think the problem comes from the file system encoding. Please test converting the filename string from NFD to NFC; here's a function that I got from Stack Overflow:

#!c++

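#include <CoreFoundation/CoreFoundation.h>
#include <string>
#include <vector>

// Converts a UTF-8 file name from decomposed (NFD) to precomposed (NFC) form.
// Returns the input unchanged if any conversion step fails.
std::string precomposeFilename(const std::string& name)
{
   CFStringRef cfStringRef = CFStringCreateWithCString(kCFAllocatorDefault, name.c_str(), kCFStringEncodingUTF8);
   if (cfStringRef == NULL)
      return name; // input was not valid UTF-8

   CFMutableStringRef cfMutable = CFStringCreateMutableCopy(NULL, 0, cfStringRef);
   CFRelease(cfStringRef);
   if (cfMutable == NULL)
      return name;

   CFStringNormalize(cfMutable, kCFStringNormalizationFormC);

   // Size the buffer from the string itself instead of assuming 255 bytes.
   CFIndex maxSize = CFStringGetMaximumSizeForEncoding(CFStringGetLength(cfMutable), kCFStringEncodingUTF8) + 1;
   std::vector<char> buffer(maxSize);
   Boolean ok = CFStringGetCString(cfMutable, &buffer[0], maxSize, kCFStringEncodingUTF8);

   CFRelease(cfMutable);

   return ok ? std::string(&buffer[0]) : name;
}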

It seems to be a very common problem, but I'm not sure whether this is what we are dealing with or something else.
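
To see the NFD/NFC difference without any OS X API, the UTF-8 bytes can be decoded into Unicode code points. The sketch below is only an illustration: the helper name decodeUtf8 is hypothetical, and it assumes well-formed UTF-8 and does no validation.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: decode a well-formed UTF-8 string into code points.
// No validation is attempted; malformed input gives meaningless output.
std::vector<unsigned long> decodeUtf8(const std::string& s)
{
    std::vector<unsigned long> out;
    std::string::size_type i = 0;
    while (i < s.size()) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        unsigned long cp;
        int extra; // number of continuation bytes that follow the lead byte
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes its low 6 bits.
        for (; extra > 0 && i < s.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}
```

Decoding the bytes 65 cc 81 reported earlier in the thread gives two code points, U+0065 ( "e" ) and U+0301 ( combining acute accent ), which is the decomposed (NFD) form of "é". Decoding c3 a9 gives the single code point U+00E9, the precomposed (NFC) form.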

Regards,
Martín

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Batte HUCHAI (Bitbucket: bhuchai, ).


It works. Thanks a lot for this last point.

from efsw.

SpartanJ avatar SpartanJ commented on July 17, 2024

Original comment by Martín Lucas Golini (Bitbucket: SpartanJ, GitHub: SpartanJ).


Excellent! I'm glad it worked... at last!
I made a commit with the corresponding changes, so you won't need to "patch" anything; just update the library.

from efsw.
