CSV import via ETL succeeds but database shows only 1 record


#1

Hello,

I am trying to import a .csv file with up to 100 rows. The ETL run reports that all rows loaded successfully, but when I check via localhost:2480 there is only one record in the database.

I run the import with ./oetl.sh as follows:

Sia-Macbook-Pro:~ sia$ cd /Users/sia/orientdb-3.0.14/bin ;

Sia-Macbook-Pro:bin sia$ ./oetl.sh /Users/sia/data/factory_test/csv/Pv20-60.json

OrientDB etl v.3.0.14 - Veloce (build ac128c5a9ba4c6dc9d25a32b92962d8863774423, branch 3.0.x) https://www.orientdb.com

2019-02-05 21:24:49:813 INFO Default limit of open files (512) will be used. [ONative]

2019-02-05 21:24:49:930 INFO 17179869184 B/16384 MB/16 GB of physical memory were detected on machine [ONative]

2019-02-05 21:24:49:931 INFO Detected memory limit for current process is 17179869184 B/16384 MB/16 GB [ONative]

2019-02-05 21:24:49:932 INFO JVM can use maximum 1963MB of heap memory [OMemoryAndLocalPaginatedEnginesInitializer]

2019-02-05 21:24:49:932 INFO Because OrientDB is running outside a container 12% of memory will be left unallocated according to the setting 'memory.leftToOS' not taking into account heap memory [OMemoryAndLocalPaginatedEnginesInitializer]

2019-02-05 21:24:49:934 INFO OrientDB auto-config DISKCACHE=12,454MB (heap=1,963MB os=16,384MB) [orientechnologies]

2019-02-05 21:24:49:934 INFO System is started under an effective user : `sia` [OEngineLocalPaginated]

2019-02-05 21:24:49:989 INFO WAL maximum segment size is set to 9,023 MB [OrientDBEmbedded]

2019-02-05 21:24:50:015 INFO BEGIN ETL PROCESSOR [OETLProcessor]

2019-02-05 21:24:50:016 INFO [file] Reading from file /Users/sia/data/factory_test/csv/Pv20-60.csv with encoding UTF-8 [OETLFileSource]

2019-02-05 21:24:50:016 INFO Started execution with 1 worker threads [OETLProcessor]

2019-02-05 21:24:50:084 INFO Page size for WAL located in /Users/sia/orientdb-3.0.14/bin/../databases/thaicompanies is set to 4096 bytes. [OCASDiskWriteAheadLog]

2019-02-05 21:24:50:312 INFO Storage 'plocal:/Users/sia/orientdb-3.0.14/bin/../databases//thaicompanies' is opened under OrientDB distribution : 3.0.14 - Veloce (build ac128c5a9ba4c6dc9d25a32b92962d8863774423, branch 3.0.x) [OLocalPaginatedStorage]

2019-02-05 21:24:51:017 INFO + extracted 54 rows (0 rows/sec) - 54 rows -> loaded 53 vertices (0 vertices/sec) Total time: 1001ms [0 warnings, 0 errors] [OETLProcessor]

2019-02-05 21:24:51:085 INFO END ETL PROCESSOR [OETLProcessor]

2019-02-05 21:24:51:086 INFO + extracted 100 rows (657 rows/sec) - 100 rows -> loaded 100 vertices (671 vertices/sec) Total time: 1071ms [0 warnings, 0 errors] [OETLProcessor]

Sia-Macbook-Pro:bin sia$

The ETL config file I used for the import is below:

{
  "source": { "file": { "path": "/Users/sia/data/factory_test/csv/Pv20-60.csv"} },
  "extractor": { "csv": {} },
  "transformers": [
    { "vertex": { "class": "factory", "skipDuplicates": true } }
  ],
  "loader": {
    "orientdb": {
       "dbURL": "plocal:../databases/thaicompanies",
       "dbType": "graph",
       "classes": [
         {"name": "Category", "extends": "V"}
       ], "indexes": [
         {"class":"factory", "fields":["id:integer"], "type":"UNIQUE" }
       ]
    }
  }
}

How can I fix this? I come from a non-IT background and am very new to both OrientDB and coding.
Thank you.


#2

I finally got this sorted out. Because my content is in Thai, I had saved the file as ‘CSV UTF-8’. When I save it as plain ‘Comma Separated Values’ instead, the import works well, but I lose all of the Thai text. So a CSV file might not work well when the text data is not in English, at least in my case.
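
One possible explanation, which this thread does not confirm: Excel's "CSV UTF-8" export prepends a byte-order mark (BOM) to the file, and a parser that does not strip it would read the first column name (for example `id`) as `\ufeffid`. Whether that is what collapsed this import to a single record is an assumption. A minimal Python sketch that copies a CSV without the BOM while leaving the Thai text untouched (the function name `strip_utf8_bom` is hypothetical):

```python
import codecs
from pathlib import Path


def strip_utf8_bom(src: str, dst: str) -> bool:
    """Copy src to dst, dropping a leading UTF-8 BOM if one is present.

    Returns True if a BOM was found and removed, False otherwise.
    The rest of the bytes (including any Thai text) are copied unchanged.
    """
    data = Path(src).read_bytes()
    had_bom = data.startswith(codecs.BOM_UTF8)  # BOM_UTF8 is b"\xef\xbb\xbf"
    if had_bom:
        data = data[len(codecs.BOM_UTF8):]
    Path(dst).write_bytes(data)
    return had_bom
```

For example, `strip_utf8_bom("Pv20-60.csv", "Pv20-60-nobom.csv")` would write a BOM-free copy (the output file name is illustrative), and the `path` in the ETL config could then point to that copy.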


#3

Hello, sia. Thanks for sharing. We wanted to follow up on your recent issue and learn more about the solution you found. Please let us know if you need any additional support.


#4

OrientDB does not support CSV UTF-8 upload, so I have to convert my data to .json format.
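
The thread does not show how the conversion to .json was done. As a rough sketch of one way to do it, assuming the CSV has a header row and is UTF-8 encoded, the following Python function turns each row into a JSON object keyed by column name (the function name `csv_to_json` is hypothetical):

```python
import csv
import json


def csv_to_json(src: str, dst: str) -> int:
    """Convert a UTF-8 CSV with a header row into a JSON array of objects.

    Returns the number of data rows written.
    """
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    with open(src, newline="", encoding="utf-8-sig") as f:
        rows = list(csv.DictReader(f))
    with open(dst, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Thai characters as-is instead of \uXXXX escapes.
        json.dump(rows, f, ensure_ascii=False, indent=2)
    return len(rows)
```

Note that `csv.DictReader` yields every value as a string, so a numeric column such as `id` would need to be cast separately if the target schema expects an integer.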