序文:私はこのトピックに関する同様の質問を認識しており、それらの解決策を役に立たないように試みました。
FTPファイルからJSONファイルを取得し、それをPandasデータフレームに変換してから、それをMySQLに送信してテーブルに入力するPythonスクリプトを作成しています。ただし、Pandas.to_sqlで問題が発生しました。エラーは:
UnicodeEncodeError: 'ascii' codec can't encode character '\xd1' in position 87: ordinal not in range(128)
これが私のコードです:
import pandas as pd
import json
from sqlalchemy import create_engine
cnx = create_engine('mysql://username:[email protected]/database?charset=utf8',encoding='utf-8')
sl1 = pd.read_json('ftp://username:[email protected]/directory/sailings_cel.txt')
sl1 = pd.json_normalize(sl1["Dataset"])
sl1.to_sql(name='celsailingtemp',con=cnx,if_exists='append',index=True)
潜在的に役立つ情報:
この問題に関するガイダンスは大歓迎です!
編集:トレースバック:
Traceback (most recent call last):
File "c:/pythondir/debug.py", line 14, in <module>
sl1.to_sql(name='celsailingtemp',con=cnx,if_exists='append',index=True)
File "C:\pythondir\.venv\lib\site-packages\pandas\core\generic.py", line 2602, in to_sql
sql.to_sql(
File "C:\pythondir\.venv\lib\site-packages\pandas\io\sql.py", line 589, in to_sql
pandas_sql.to_sql(
File "C:\pythondir\.venv\lib\site-packages\pandas\io\sql.py", line 1398, in to_sql
table.insert(chunksize, method=method)
File "C:\pythondir\.venv\lib\site-packages\pandas\io\sql.py", line 830, in insert
exec_insert(conn, keys, chunk_iter)
File "C:\pythondir\.venv\lib\site-packages\pandas\io\sql.py", line 747, in _execute_insert
conn.execute(self.table.insert(), data)
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\engine\base.py", line 1014, in execute
return meth(self, multiparams, params)
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\sql\elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\engine\base.py", line 1127, in _execute_clauseelement
ret = self._execute_context(
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\engine\base.py", line 1317, in _execute_context
self._handle_dbapi_exception(
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\engine\base.py", line 1515, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\util\compat.py", line 178, in raise_
raise exception
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\engine\base.py", line 1257, in _execute_context
self.dialect.do_executemany(
File "C:\pythondir\.venv\lib\site-packages\sqlalchemy\dialects\mysql\mysqldb.py", line 148, in do_executemany
rowcount = cursor.executemany(statement, parameters)
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\cursors.py", line 230, in executemany
return self._do_execute_many(
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\cursors.py", line 255, in _do_execute_many
v = values % escape(next(args), conn)
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\cursors.py", line 110, in _escape_args
ret = tuple(literal(ensure_bytes(arg)) for arg in args)
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\cursors.py", line 110, in <genexpr>
ret = tuple(literal(ensure_bytes(arg)) for arg in args)
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\connections.py", line 286, in literal
s = self._tuple_literal(o)
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\connections.py", line 269, in _tuple_literal
return b"(%s)" % (b",".join(map(self.literal, t)))
File "C:\pythondir\.venv\lib\site-packages\MySQLdb\connections.py", line 288, in literal
s = self.escape(o, self.encoders)
UnicodeEncodeError: 'ascii' codec can't encode character '\xd1' in position 87: ordinal not in range(128)
問題のある文字を含むJSONスニペット:
{
"Table": "Cruise Sailing - Celebrity Cruises (CEL)",
"Number_of_CEL_sailings": 1032,
"Dataset": [
{
"SailingID": "117624",
"SailingPlanID": "70561",
"SailingPlanCode": "07G019",
"MarketName": "[United States, Canada, United Kingdom, Ireland, Australia, Spain, German, Portugal, Switzerland, New Zealand, United Arab Emirates, Mexico]",
"MarketNameList": [
"United States",
"Canada",
"United Kingdom",
"Ireland",
"Australia",
"Spain",
"German",
"Portugal",
"Switzerland",
"New Zealand",
"United Arab Emirates",
"Mexico"
],
"ShipID": "1373927",
"ShipName": "Celebrity Flora",
"DepartureDate": "2020-08-23",
"Duration": "7",
"PackageID": "FL07G019",
"SailingName": "7 NIGHT GALAPAGOS INNER LOOP",
"DestinationID": "615",
"DeparturePortCode": "BAL",
"DeparturePortID": "1203",
"ReturnPortCode": "BAL",
"ReturnPortID": "1203",
"PackageTypeID": "0",
"TourOnly": "no",
"Segment": "no",
"Itinerary": [
{
"DayNumber": 1,
"PortID": "1203",
"PortCode": "BAL",
"Description": "BALTRA (GALAPAGOS)",
"Language": "en_US",
"ArrivalTime": "--",
"DepartureTime": "2:30 PM"
},
{
"DayNumber": 1,
"PortID": "3781",
"PortCode": "DAP",
"Description": "DAPHNE ISLAND",
"Language": "en_US",
"ArrivalTime": "5:00 PM",
"DepartureTime": "6:00 PM"
},
{
"DayNumber": 2,
"PortID": "1212",
"PortCode": "RAB",
"Description": "RABIDA (GALAPAGOS)",
"Language": "en_US",
"ArrivalTime": "3:00 PM",
"DepartureTime": "7:00 PM"
},
{
"DayNumber": 2,
"PortID": "1211",
"PortCode": "PGA",
"Description": "EGAS PORT",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "12:00 PM"
},
{
"DayNumber": 2,
"PortID": "1212",
"PortCode": "RAB",
"Description": "RABIDA (GALAPAGOS)",
"Language": "en_US",
"ArrivalTime": "3:00 PM",
"DepartureTime": "7:00 PM"
},
{
"DayNumber": 3,
"PortID": "1656",
"PortCode": "EBI",
"Description": "ELIZABETH BAY",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "12:00 PM"
},
{
"DayNumber": 3,
"PortID": "1209",
"PortCode": "CET",
"Description": "TAGUS COVE (ISABELLA)",
"Language": "en_US",
"ArrivalTime": "3:00 PM",
"DepartureTime": "7:00 PM"
},
{
"DayNumber": 4,
"PortID": "1211",
"PortCode": "SLV",
"Description": "SULLIVAN BAY (SANTIAGO)",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "12:00 PM"
},
{
"DayNumber": 4,
"PortID": "1207",
"PortCode": "BAR",
"Description": "BARTOLOME ISLAND",
"Language": "en_US",
"ArrivalTime": "1:00 PM",
"DepartureTime": "7:00 PM"
},
{
"DayNumber": 5,
"PortID": "1208",
"PortCode": "LSB",
"Description": "LAS BACHAS - (SANTA CRUZ)",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "12:00 PM"
},
{
"DayNumber": 5,
"PortID": "1204",
"PortCode": "NSY",
"Description": "NORTH SEYMOUR ISLAND",
"Language": "en_US",
"ArrivalTime": "1:00 PM",
"DepartureTime": "9:30 PM"
},
{
"DayNumber": 6,
"PortID": "6167",
"PortCode": "SCC",
"Description": "PUERTO BAQUERIZO (CRISTOBAL)",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "12:00 PM"
},
{
"DayNumber": 6,
"PortID": "6168",
"PortCode": "PSC",
"Description": "PUNTA PITT (SAN CRISTOBAL)",
"Language": "en_US",
"ArrivalTime": "3:30 PM",
"DepartureTime": "6:00 PM"
},
{
"DayNumber": 7,
"PortID": "1536",
"PortCode": "PTA",
"Description": "PUERTO AYORA (SANTA CRUZ)",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "7:00 PM"
},
{
"DayNumber": 8,
"PortID": "1203",
"PortCode": "BAL",
"Description": "BALTRA (GALAPAGOS)",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "--"
}
]
},
{
"SailingID": "117625",
"SailingPlanID": "70560",
"SailingPlanCode": "07G020",
"MarketName": "[United States, Canada, United Kingdom, Ireland, Australia, Spain, German, Portugal, Switzerland, New Zealand, United Arab Emirates, Mexico]",
"MarketNameList": [
"United States",
"Canada",
"United Kingdom",
"Ireland",
"Australia",
"Spain",
"German",
"Portugal",
"Switzerland",
"New Zealand",
"United Arab Emirates",
"Mexico"
],
"ShipID": "1373927",
"ShipName": "Celebrity Flora",
"DepartureDate": "2020-08-30",
"Duration": "7",
"PackageID": "FL07G020",
"SailingName": "7 NIGHT GALAPAGOS OUTER LOOP",
"DestinationID": "615",
"DeparturePortCode": "BAL",
"DeparturePortID": "1203",
"ReturnPortCode": "BAL",
"ReturnPortID": "1203",
"PackageTypeID": "0",
"TourOnly": "no",
"Segment": "no",
"Itinerary": [
{
"DayNumber": 1,
"PortID": "1203",
"PortCode": "BAL",
"Description": "BALTRA (GALAPAGOS)",
"Language": "en_US",
"ArrivalTime": "--",
"DepartureTime": "2:30 PM"
},
{
"DayNumber": 1,
"PortID": "3781",
"PortCode": "DAP",
"Description": "DAPHNE ISLAND",
"Language": "en_US",
"ArrivalTime": "5:00 PM",
"DepartureTime": "6:00 PM"
},
{
"DayNumber": 2,
"PortID": "3020",
"PortCode": "GRB",
"Description": "GARDNER BAY (ESPAÑOLA)",
"Language": "en_US",
"ArrivalTime": "7:00 AM",
"DepartureTime": "2:00 PM"
}
]
}
]
}
celsailingtempのDDL:
CREATE TABLE `celsailingtemp` (
`index` int(11) NOT NULL,
`sailingplanid` int(11) DEFAULT NULL,
`sailingid` int(11) DEFAULT NULL,
`sailingplancode` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`marketname` json DEFAULT NULL,
`marketnamelist` json DEFAULT NULL,
`shipid` int(11) DEFAULT NULL,
`shipname` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`departuredate` date DEFAULT NULL,
`duration` int(11) DEFAULT NULL,
`packageid` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`sailingname` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`destinationid` int(11) DEFAULT NULL,
`departureportcode` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`departureportid` int(11) DEFAULT NULL,
`returnportcode` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`returnportid` int(11) DEFAULT NULL,
`packagetypeid` int(11) DEFAULT NULL,
`touronly` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`segment` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`itinerary` json DEFAULT NULL,
PRIMARY KEY (`index`),
UNIQUE KEY `index_UNIQUE` (`index`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
json_normalize level…ネストされた「itinerary」列を長い文字列に格下げします
実際には、そうではありません。列にはリスト(またはSeriesオブジェクト)が含まれており、DBAPIドライバーを混乱させていました。それを機能させるために私はこれをしなければなりませんでした:
sl1 = pd.read_json(r"C:\Users\Gord\Desktop\sailing.json")
sl1 = pd.json_normalize(sl1["Dataset"])
sl1["MarketName"] = sl1["MarketName"].map(lambda x: json.dumps(x))
sl1["MarketNameList"] = sl1["MarketNameList"].map(lambda x: json.dumps(x))
sl1["Itinerary"] = sl1["Itinerary"].map(lambda x: json.dumps(x))
sl1.to_sql(name="celsailingtemp", con=cnx, if_exists="append", index=True)
この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。
侵害の場合は、連絡してください[email protected]
コメントを追加