azure / azure-kusto-python
Kusto client libraries for Python
License: MIT License
Currently, KustoClient.execute* can return either str or KustoResponseDataSet, depending on get_raw_response.
Hey,
I'm running a query, and the method failed with an exception.
Code -
from azure.kusto.data.request import KustoClient, KustoConnectionStringBuilder, ClientRequestProperties
from azure.kusto.data.exceptions import KustoServiceError
from azure.kusto.data.helpers import dataframe_from_result_table
cluster = "https://clustername.kusto.windows.net"
db = "dbname"
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(cluster)
client = KustoClient(kcsb)
query = """
TableName
| where machine_id == "machine_id"
| where env_time > ago(1d)
| project env_time, message
"""
client.execute(db, query)
Exception -
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
c:\python364-32\lib\site-packages\azure\kusto\data\_models.py in __init__(self, columns, row)
63 else:
---> 64 typed_value = KustoResultRow.convertion_funcs[lower_column_type](value)
65 except (IndexError, AttributeError):
c:\python364-32\lib\site-packages\azure\kusto\data\_converters.py in to_datetime(value)
18 return parser.parse(value)
---> 19 return parser.isoparse(value)
20
AttributeError: module 'dateutil.parser' has no attribute 'isoparse'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
<ipython-input-12-bb7a6bdf3c0d> in <module>()
----> 1 client.execute(db, query)
c:\python364-32\lib\site-packages\azure\kusto\data\request.py in execute(self, database, query, properties)
275 if query.startswith("."):
276 return self.execute_mgmt(database, query, properties)
--> 277 return self.execute_query(database, query, properties)
278
279 def execute_query(self, database, query, properties=None):
c:\python364-32\lib\site-packages\azure\kusto\data\request.py in execute_query(self, database, query, properties)
285 :rtype: azure.kusto.data._response.KustoResponseDataSet
286 """
--> 287 return self._execute(self._query_endpoint, database, query, KustoClient._query_default_timeout, properties)
288
289 def execute_mgmt(self, database, query, properties=None):
c:\python364-32\lib\site-packages\azure\kusto\data\request.py in _execute(self, endpoint, database, query, default_timeout, properties)
320 if response.status_code == 200:
321 if endpoint.endswith("v2/rest/query"):
--> 322 return KustoResponseDataSetV2(response.json())
323 return KustoResponseDataSetV1(response.json())
324
c:\python364-32\lib\site-packages\azure\kusto\data\_response.py in __init__(self, json_response)
132
133 def __init__(self, json_response):
--> 134 super(KustoResponseDataSetV2, self).__init__([t for t in json_response if t["FrameType"] == "DataTable"])
c:\python364-32\lib\site-packages\azure\kusto\data\_response.py in __init__(self, json_response)
16
17 def __init__(self, json_response):
---> 18 self.tables = [KustoResultTable(t) for t in json_response]
19 self.tables_count = len(self.tables)
20 self.tables_names = [t.table_name for t in self.tables]
c:\python364-32\lib\site-packages\azure\kusto\data\_response.py in <listcomp>(.0)
16
17 def __init__(self, json_response):
---> 18 self.tables = [KustoResultTable(t) for t in json_response]
19 self.tables_count = len(self.tables)
20 self.tables_names = [t.table_name for t in self.tables]
c:\python364-32\lib\site-packages\azure\kusto\data\_models.py in __init__(self, json_table)
128 raise KustoServiceError(errors[0]["OneApiErrors"][0]["error"]["@message"], json_table)
129
--> 130 self.rows = [KustoResultRow(self.columns, row) for row in json_table["Rows"]]
131
132 @property
c:\python364-32\lib\site-packages\azure\kusto\data\_models.py in <listcomp>(.0)
128 raise KustoServiceError(errors[0]["OneApiErrors"][0]["error"]["@message"], json_table)
129
--> 130 self.rows = [KustoResultRow(self.columns, row) for row in json_table["Rows"]]
131
132 @property
c:\python364-32\lib\site-packages\azure\kusto\data\_models.py in __init__(self, columns, row)
64 typed_value = KustoResultRow.convertion_funcs[lower_column_type](value)
65 except (IndexError, AttributeError):
---> 66 typed_value = KustoResultRow.convertion_funcs[lower_column_type](value)
67 elif lower_column_type in KustoResultRow.convertion_funcs:
68 typed_value = KustoResultRow.convertion_funcs[lower_column_type](value)
c:\python364-32\lib\site-packages\azure\kusto\data\_converters.py in to_datetime(value)
17 if isinstance(value, six.integer_types):
18 return parser.parse(value)
---> 19 return parser.isoparse(value)
20
21
AttributeError: module 'dateutil.parser' has no attribute 'isoparse'
Machine/Python details -
CPython 3.6.4
azure-kusto-data==0.0.27
compiler : MSC v.1900 32 bit (Intel)
system : Windows
release : 10
machine : AMD64
interpreter: 32bit
Thanks
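For reference, dateutil.parser.isoparse was added in python-dateutil 2.7.0, so the AttributeError above most likely means an older dateutil is installed. A quick way to check (an assumption about the root cause, not a confirmed fix):

```python
# Probe the installed python-dateutil for isoparse, which was
# added in version 2.7.0.
from dateutil import parser

has_isoparse = hasattr(parser, "isoparse")
print("isoparse available" if has_isoparse else "upgrade python-dateutil to >= 2.7.0")
```

If the check fails, pip install --upgrade python-dateutil should make isoparse available.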
I wanted to map female to 1 and male to 0 in an int data type, but I'm getting 'Cannot convert non-finite values (NA or inf) to integer'. I have checked for null values and there is no NaN value in that column.
for dataset in combine:
    dataset['Sex'] = dataset['Sex'].map({'female': 1, 'male': 0}).astype(int)
ValueError Traceback (most recent call last)
in
1 for dataset in combine:
----> 2 dataset['Sex']=dataset['Sex'].dropna(axis=0).map({'female':1, 'male':0}).astype(int)
3
4 train.head()
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors, **kwargs)
5689 # else, only a single dtype is given
5690 new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5691 **kwargs)
5692 return self._constructor(new_data).finalize(self)
5693
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, **kwargs)
529
530 def astype(self, dtype, **kwargs):
--> 531 return self.apply('astype', dtype=dtype, **kwargs)
532
533 def convert(self, **kwargs):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
393 copy=align_copy)
394
--> 395 applied = getattr(b, f)(**kwargs)
396 result_blocks = _extend_blocks(applied, result_blocks)
397
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors, values, **kwargs)
532 def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
533 return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 534 **kwargs)
535
536 def _astype(self, dtype, copy=False, errors='raise', values=None,
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in _astype(self, dtype, copy, errors, values, **kwargs)
631
632 # _astype_nansafe works fine with 1-d only
--> 633 values = astype_nansafe(values.ravel(), dtype, copy=True)
634
635 # TODO(extension)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
674
675 if not np.isfinite(arr).all():
--> 676 raise ValueError('Cannot convert non-finite values (NA or inf) to '
677 'integer')
678
ValueError: Cannot convert non-finite values (NA or inf) to integer
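A minimal sketch of what is going on (illustrative data, not the original dataset): Series.map yields NaN for any value absent from the mapping, including pre-existing NaN, and astype(int) cannot represent NaN. Filling the gaps first, or casting to pandas' nullable Int64 dtype, avoids the ValueError:

```python
import pandas as pd

s = pd.Series(["female", "male", None])   # one missing value
mapped = s.map({"female": 1, "male": 0})  # -> 1.0, 0.0, NaN

filled = mapped.fillna(0).astype(int)     # fill before casting to int
nullable = mapped.astype("Int64")         # or keep missing values as <NA>
```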
The name collision between the instance attribute and the method next in the KustoResultIter class causes "TypeError: 'int' object is not callable" whenever I try advancing my iterator instance with iter.next().
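The shadowing is easy to reproduce in isolation. This toy class (not the library's actual code) shows how assigning an int to self.next in __init__ hides the method of the same name:

```python
class ShadowedIter:
    def __init__(self):
        # This instance attribute shadows the method below: lookups on
        # an instance find the int first, so it.next() is not callable.
        self.next = 0

    def next(self):
        return self.next


it = ShadowedIter()
try:
    it.next()
except TypeError as err:
    print(err)  # 'int' object is not callable
```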
On a Raspberry Pi Zero W running out-of-the-box Raspbian Stretch, running
pip install azure-kusto-data
tries to install pandas (even though the [pandas] flag was not used) and then hangs:
Downloading https://files.pythonhosted.org/packages/c5/db/e56e6b4bbac7c4a06de1c50de6fe1ef3810018ae11732a50f15f62c7d050/enum34-1.1.6-py2-none-any.whl
Collecting ipaddress (from cryptography>=1.1.0->adal>=1.0.0->azure-kusto-data)
Downloading https://files.pythonhosted.org/packages/fc/d0/7fc3a811e011d4b388be48a0e381db8d990042df54aa4ef4599a31d39853/ipaddress-1.0.22-py2.py3-none-any.whl
Collecting pycparser (from cffi!=1.11.3,>=1.7->cryptography>=1.1.0->adal>=1.0.0->azure-kusto-data)
Downloading https://www.piwheels.org/simple/pycparser/pycparser-2.18-py2.py3-none-any.whl (209kB)
100% |████████████████████████████████| 215kB 253kB/s
Building wheels for collected packages: pandas, cryptography, numpy, cffi
Running setup.py bdist_wheel for pandas ... |
KustoClient today only supports client ID with a key. We need to support client ID with a certificate to address scenarios in which a key doesn't meet the security bar.
There is a typo in the __init__ function of KustoResultColumn: the parameter ordianl should be ordinal.
from azure.kusto.data.request import KustoClient, KustoConnectionStringBuilder
from azure.kusto.data.exceptions import KustoServiceError
kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(
    connection_string='',
    aad_app_id='',
    app_key='',
    authority_id='')
client = KustoClient(kcsb)
db = 'db'
try:
    query = "T"
    response = client.execute(db, query)
except KustoServiceError as error:
    response = None
print response.primary_results[0]
schema:
.create table T (a:string, b:decimal)
.ingest inline into table T
[,]
Code fails with error TypeError: Cannot convert None to Decimal
It's because of the decimal conversion function; adding a None check to that if fixes the problem.
File "test.py", line 17, in <module>
response = client.execute(db, query)
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/request.py", line 396, in execute
return self.execute_query(database, query, properties)
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/request.py", line 407, in execute_query
self._query_endpoint, database, query, None, KustoClient._query_default_timeout, properties
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/request.py", line 462, in _execute
return KustoResponseDataSetV2(response.json())
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/_response.py", line 134, in __init__
super(KustoResponseDataSetV2, self).__init__([t for t in json_response if t["FrameType"] == "DataTable"])
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/_response.py", line 18, in __init__
self.tables = [KustoResultTable(t) for t in json_response]
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/_models.py", line 137, in __init__
self.rows = [KustoResultRow(self.columns, row) for row in json_table["Rows"]]
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/azure/kusto/data/_models.py", line 71, in __init__
typed_value = KustoResultRow.convertion_funcs[column_type](value)
File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/decimal.py", line 657, in __new__
raise TypeError("Cannot convert %r to Decimal" % value)
TypeError: Cannot convert None to Decimal
pip freeze
adal==1.2.2
asn1crypto==0.24.0
azure-kusto-data==0.0.31
azure-nspkg==3.0.2
certifi==2019.6.16
cffi==1.12.3
chardet==3.0.4
cryptography==2.7
enum34==1.1.6
idna==2.8
ipaddress==1.0.22
pbr==5.4.2
pycparser==2.19
PyJWT==1.7.1
python-dateutil==2.8.0
requests==2.22.0
six==1.12.0
stevedore==1.30.1
urllib3==1.25.3
virtualenv==16.7.2
virtualenv-clone==0.5.3
virtualenvwrapper==4.8.4
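A None-safe converter along these lines would avoid the crash. This is a sketch of the shape of the fix, not the library's actual converter code:

```python
from decimal import Decimal


def to_decimal(value):
    # Kusto returns null for empty decimal cells (the "[,]" row above),
    # and Decimal(None) raises TypeError; pass nulls through instead.
    if value is None:
        return None
    return Decimal(value)
```

With this shape, to_decimal(None) returns None instead of raising, while valid values still convert exactly.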
When using pandas, the dataframe_from_result_table call throws an error if it is passed a blank result set.
#103 introduced this problem: lines 13 & 14 throw an error if the parameter evaluates to False. As empty result tables evaluate to False, this causes any blank result set to throw a ValueError.
Working code:
response = client.execute("DB", "print 'a'")
return [dataframe_from_result_table(x) for x in response.primary_results]
Broken code:
response = client.execute("DB", "print 'a' | take 0")
return [dataframe_from_result_table(x) for x in response.primary_results]
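The underlying pitfall is Python truthiness: an empty result table is falsy, so a guard written as if not param rejects it along with None. A sketch of the distinction using hypothetical guard functions (not the library's actual code):

```python
def broken_guard(result_table):
    # Rejects empty tables too, because empty containers are falsy.
    if not result_table:
        raise ValueError("Did not receive a valid result table")


def fixed_guard(result_table):
    # Rejects only a genuinely missing argument.
    if result_table is None:
        raise ValueError("Did not receive a valid result table")


fixed_guard([])  # an empty result set is still a valid result
```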
fields = ["Name", "Metric", "Source"]
rows = [["p1", 23, "SDK"], ["p2", 25, "SDK"]]
df = pandas.DataFrame(data=rows, columns=fields)
ingestClient.ingest_from_dataframe(df, ingestion_properties=ingestion_props)
Total Sample available at below link
https://gist.github.com/prashanthmadi/738da1cf92f825eedf1451b985cee6e8
As you can see in the screenshot below, the fields weren't matched during ingestion: I don't see the p1, p2 data; instead I see the metrics in the Name field.
pip freeze
azure-kusto-data 0.0.35
azure-kusto-ingest 0.0.35
INGESTION_CLIENT = KustoIngestClient(KCSB_INGEST)
FILE_SIZE = 64158321 # in bytes
# All ingestion properties are documented here: https://docs.microsoft.com/azure/kusto/management/data-ingest#ingestion-properties
INGESTION_PROPERTIES = IngestionProperties(database=KUSTO_DATABASE, table=DESTINATION_TABLE, dataFormat=DataFormat.parquet,
                                           mappingReference=DESTINATION_TABLE_COLUMN_MAPPING, additionalProperties={"creationTime": "2019-08-21"})
# FILE_SIZE is the raw size of the data in bytes
for BLOB_PATH in blob_list[0]:
    BLOB_DESCRIPTOR = BlobDescriptor(BLOB_PATH, FILE_SIZE)
    INGESTION_CLIENT.ingest_from_blob(
        BLOB_DESCRIPTOR, ingestion_properties=INGESTION_PROPERTIES)
print('Done queuing up ingestion with Azure Data Explorer')
I set everything to Parquet, but after running the ingestion, the library still thought I was importing CSV. This is the error from ADX:
"Details": Mapping reference 'PARQUET_Mapping' of type 'csv' in database 'db01' could not be found.,
Installing the Kusto client typically requires me to install or update Pandas or its upstream dependencies like NumPy, even though only one method currently depends on it. Is it feasible to make Pandas an optional dependency?
Hello,
Why is random.choice used in KustoIngestClient to select details from containers and queues? Is there really no better strategy than picking at random?
It had been working fine until the recent change to use a connection pool.
Error:
MaxRetryError: HTTPSConnectionPool(host='xxxx.uksouth.kusto.windows.net', port=443): Max retries exceeded with url: /v1/rest/mgmt (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f037fbb1b00>: Failed to establish a new connection: [Errno 111] Connection refused',))
I'm behind corporate proxy, using with_aad_application_key_authentication as authentication.
I receive the following error when attempting to process multiple local CSV files:
Bad request: Request is invalid and cannot be executed. Entity ID '[DB NetDefaultDB v?.?]' of kind 'Database' was not found.
I'm passing my DB name as the database I'm connecting to, so I searched the azure-kusto-ingest source and found that the first place NetDefaultDB is used in the process is the _get_temp_storage_objects function:
def _get_temp_storage_objects(self):
    response = self._kusto_client.execute_mgmt("NetDefaultDB", ".create tempstorage")
    storages = list()
    for row in response.iter_all():
        storages.append(_ConnectionString.parse(row["StorageRoot"]))
    return storages
The first parameter of the execute-management function is the database. Since "NetDefaultDB" is a hard-coded string, I'm not sure whether this is part of Kusto's inner workings for ingestion and the error is permission-related, or whether this value is supposed to be a variable referring to the database actually being accessed.
If query execution doesn't succeed, the Kusto client fails to parse the error because it's not in JSON format, e.g.
`JSONDecodeError Traceback (most recent call last)
in
----> 1 drop_response = client.execute(db_aliases["Trouter Client PROD"], drop_command)
2 drop_response
~/anaconda3_501/lib/python3.6/site-packages/azure/kusto/data/request.py in execute(self, database, query, properties)
393 """
394 if query.startswith("."):
--> 395 return self.execute_mgmt(database, query, properties)
396 return self.execute_query(database, query, properties)
397
~/anaconda3_501/lib/python3.6/site-packages/azure/kusto/data/request.py in execute_mgmt(self, database, query, properties)
416 :rtype: azure.kusto.data._response.KustoResponseDataSet
417 """
--> 418 return self._execute(self._mgmt_endpoint, database, query, None, KustoClient._mgmt_default_timeout, properties)
419
420 def execute_streaming_ingest(self, database, table, stream, stream_format, properties=None, mapping_name=None):
~/anaconda3_501/lib/python3.6/site-packages/azure/kusto/data/request.py in _execute(self, endpoint, database, query, payload, timeout, properties)
463 return KustoResponseDataSetV1(response.json())
464
--> 465 raise KustoServiceError([response.json()], response)
466
467 def _get_timeout(self, properties, default):
~/anaconda3_501/lib/python3.6/site-packages/requests/models.py in json(self, **kwargs)
895 # used.
896 pass
--> 897 return complexjson.loads(self.text, **kwargs)
898
899 @property
~/anaconda3_501/lib/python3.6/json/init.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 parse_int is None and parse_float is None and
353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
355 if cls is None:
356 cls = JSONDecoder
`
The following lambda fails to parse a timespan correctly when days and seconds co-exist.
For example, the Kusto timespan 7.04:44:01.5115511 (7 days, 4 hrs, 44 min, 1.5115511 seconds) results in the string "7 days 04:44:01 days 5115511" being passed to pd.to_timedelta, which incorrectly produces "67 days 09:42:31". Only the first dot should be replaced with " days ".
if col_type.lower() == "timespan":
    frame[col_name] = pandas.to_timedelta(
        frame[col_name].apply(lambda t: t.replace(".", " days ") if t and "." in t.split(":")[0] else t)
    )
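Replacing only the first dot fixes this, since str.replace accepts a count argument. A sketch of the corrected transformation applied to a bare string:

```python
def fix_timespan(t):
    # Only the first "." separates days from hours; a later "."
    # separates seconds from fractional seconds and must be kept.
    if t and "." in t.split(":")[0]:
        return t.replace(".", " days ", 1)
    return t


fix_timespan("7.04:44:01.5115511")  # -> "7 days 04:44:01.5115511"
```

pd.to_timedelta then parses the result as 7 days 04:44:01.5115511 rather than the incorrect 67-day value.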
@thisisnish commented on Fri Aug 23 2019
Of the three python sdks listed here https://docs.microsoft.com/en-us/azure/kusto/api/python/kusto-python-client-library, none currently support the .purge operation. Is this correct, or am I missing something? How can one send a .purge command using the python sdk? Will this be incorporated in a future release of the azure-kusto-mgmt library?
@kaerm commented on Fri Aug 23 2019
Hi @thisisnish thanks for letting us know, I'm tagging relevant teams to help with this
@thisisnish commented on Tue Aug 27 2019
@kaerm any update or information on this?
@kaerm commented on Tue Aug 27 2019
@thisisnish working on finding the right team
I have a query that executes fine in Kusto.Explorer. When I try to execute the same query and return a dataframe using dataframe = dataframe_from_result_table(response.primary_results[0]), I always get this error:
File "c:/Users/yizhon/OneDrive - Microsoft/Azure IoT/Telemetry/Finance correlation/data-joining-scripts/data-merger.py", line 101, in ingestUsageDataFromKusto
dataframe = dataframe_from_result_table(response.primary_results[0])
File "C:\Python27\lib\site-packages\azure\kusto\data\helpers.py", line 56, in dataframe_from_result_table
frame[col_name] = frame[col_name].astype(pandas_type, errors="raise" if raise_errors else "ignore")
File "C:\Python27\lib\site-packages\pandas\util\_decorators.py", line 178, in wrapper
return func(*args, **kwargs)
File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 5001, in astype
**kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3714, in astype
return self.apply('astype', dtype=dtype, **kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3581, in apply
applied = getattr(b, f)(**kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 575, in astype
**kwargs)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 664, in _astype
values = astype_nansafe(values.ravel(), dtype, copy=True)
File "C:\Python27\lib\site-packages\pandas\core\dtypes\cast.py", line 702, in astype_nansafe
raise ValueError('Cannot convert non-finite values (NA or inf) to '
ValueError: Cannot convert non-finite values (NA or inf) to integer
The error arises when I try to join certain tables, though some joins succeed. Is there a way to stop the conversion to integer?
pip3 install setuptools --upgrade
pip3 install azure-kusto-data
TELEMETRY_ROOT=$SYSTEM_DEFAULTWORKINGDIRECTORY/telemetry
echo "Updating functions..."
for file in $TELEMETRY_ROOT/shared/*.csl; do
    python3 $TELEMETRY_ROOT/deploy/scripts/execute_query.py $file
done
We're using the python version without pandas. However, the SDK seems to try and use Pandas, failing with the following:
Traceback (most recent call last):
2019-08-26T17:39:05.6853772Z File "/home/vsts/work/r1/a/telemetry/deploy/scripts/execute_query.py", line 6, in <module>
2019-08-26T17:39:05.6854664Z from azure.kusto.data.request import KustoClient, KustoConnectionStringBuilder, ClientRequestProperties
2019-08-26T17:39:05.6856179Z File "/home/vsts/.local/lib/python3.5/site-packages/azure/kusto/data/request.py", line 15, in <module>
2019-08-26T17:39:05.6856520Z from ._response import KustoResponseDataSetV1, KustoResponseDataSetV2
2019-08-26T17:39:05.6856881Z File "/home/vsts/.local/lib/python3.5/site-packages/azure/kusto/data/_response.py", line 10, in <module>
2019-08-26T17:39:05.6857175Z from ._models import KustoResultColumn, KustoResultRow, KustoResultTable, WellKnownDataSet
2019-08-26T17:39:05.6857598Z File "/home/vsts/.local/lib/python3.5/site-packages/azure/kusto/data/_models.py", line 30, in <module>
2019-08-26T17:39:05.6857655Z class KustoResultRow(object):
2019-08-26T17:39:05.6858359Z File "/home/vsts/.local/lib/python3.5/site-packages/azure/kusto/data/_models.py", line 35, in KustoResultRow
2019-08-26T17:39:05.6858558Z pandas_funcs = {"datetime": to_pandas_datetime, "timespan": to_pandas_timedelta}
2019-08-26T17:39:05.6858990Z NameError: name 'to_pandas_datetime' is not defined
pip freeze
I ran it from Azure DevOps so I don't have pip freeze output handy. Since this only started in the last 1-2 days, I'm assuming it's a regression in azure-kusto-data 0.0.32.
The application ID is looked up under windows.net, which fails the query execution:
Get Token request returned http error: 400 and server response: {"error":"unauthorized_client","error_description":"AADSTS70001: Application with identifier '###' was not found in the directory windows.net
Hi! I read in this documentation that we can use this library in a Jupyter Notebook attached to a Spark cluster or Databricks.
I tried running from Pyspark3:
%config
!pip install azure-kusto-data==0.0.19
But received the error msg:
OSError: [Errno 13] Permission denied: '/usr/bin/anaconda/lib/python2.7/site-packages/dateutil/tzwin.py'
Do you know some possible reason?
Thank you,
Natalia
client = KustoIngestClient(kcsb)
i=0
for BLOB_PATH in blob_list:
    if i % 50 == 0:
        print(str(i) + ":" + BLOB_PATH)
    ingestion_props = IngestionProperties(
        database="HEB",
        table="Orders",
        dataFormat=DataFormat.PARQUET,
        mappingReference="ordersparquetmapping"
        # in case status updates for success are also required
        # reportLevel=ReportLevel.FailuresAndSuccesses,
    )
    blob_descriptor = BlobDescriptor(blob_list[0], 100000)  # 100000 is the raw size of the data in bytes.
    client.ingest_from_blob(blob_descriptor, ingestion_properties=ingestion_props)
    i = i + 1
print("Done queueing all files")
KustoServiceError: (KustoServiceError(...), [{'error': {'message': 'Request is invalid and cannot be executed.', '@type': 'Kusto.Data.Exceptions.SyntaxException', '@context': {'activityStack': '(Activity stack: CRID=KPC.execute;95c3c526-bc86-4103-957a-61bd9e71c72b ARID=1b4b19a9-9ea6-44f7-a31f-5c1fcbc5085f > DN.Admin.Client.ExecuteControlCommand/a2724130-abe8-46e7-90e7-501b58719343 > P.WCF.Service.ExecuteControlCommandInternal..IAdminClientServiceCommunicationContract/4d754776-c142-4f70-b327-f1eb2c19fb3c > DN.FE.ExecuteControlCommand/4986aaed-731c-4131-99f0-7905f36d4413)', 'appDomainName': 'Kusto.WinSvc.Svc.exe', 'processName': 'Kusto.WinSvc.Svc', 'processId': 3796, 'activityType': 'DN.FE.ExecuteControlCommand', 'timestamp': '2019-10-01T23:39:23.0328157Z', 'threadId': 6296, 'subActivityId': '4986aaed-731c-4131-99f0-7905f36d4413', 'clientRequestId': 'KPC.execute;95c3c526-bc86-4103-957a-61bd9e71c72b', 'machineName': 'KEngine000001', 'activityId': '1b4b19a9-9ea6-44f7-a31f-5c1fcbc5085f', 'parentActivityId': '4d754776-c142-4f70-b327-f1eb2c19fb3c', 'serviceAlias': 'ADXDEMO'}, '@message': "Syntax error: Query could not be parsed: . Query: '.get ingestion resources'", 'code': 'Bad request', '@permanent': True}}])
pip freeze
absl-py==0.8.0
adal==1.2.2
asn1crypto==0.24.0
astor==0.8.0
azure-common==1.1.23
azure-kusto-data==0.0.35
azure-kusto-ingest==0.0.35
azure-storage-blob==2.1.0
azure-storage-common==2.1.0
azure-storage-queue==2.1.0
certifi==2018.11.29
cffi==1.11.5
chardet==3.0.4
conda==4.5.12
cryptography==2.4.2
gast==0.3.0
google-pasta==0.1.7
grpcio==1.23.0
h5py==2.10.0
idna==2.8
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
Markdown==3.1.1
menuinst==1.4.14
numpy==1.17.2
opt-einsum==3.0.1
protobuf==3.9.1
py4j==0.10.7
pycosat==0.6.3
pycparser==2.19
PyJWT==1.7.1
pyOpenSSL==18.0.0
PySocks==1.6.8
pyspark==2.4.1
python-dateutil==2.8.0
pywin32==223
requests==2.21.0
ruamel-yaml==0.15.46
six==1.12.0
tb-nightly==1.15.0a20190806
termcolor==1.1.0
tf-estimator-nightly==1.14.0.dev2019080601
urllib3==1.24.1
Werkzeug==0.15.6
win-inet-pton==1.0.1
wincertstore==0.2
wrapt==1.11.2
When trying to authenticate with Kusto, I am getting the error below. Typically I manually go to a related site and force re-authentication with MFA, and that resolves the issue. However, that is not working now either. This error didn't exist for months, until the last month or two. Please suggest how to work around this.
AdalError: Get Token request returned http error: 400 and server response: {"error":"interaction_required","error_description":"AADSTS50079: The user is required to use multi-factor authentication.\r\nTrace ID: 8d4704cf-5c83-42bb-860b-7a2180d1fa00\r\nCorrelation ID: 87014a28-6bc1-4413-8592-ec1aac63646e\r\nTimestamp: 2018-12-31 20:57:33Z","error_codes":[50079],"timestamp":"2018-12-31 20:57:33Z","trace_id":"8d4704cf-5c83-42bb-860b-7a2180d1fa00","correlation_id":"87014a28-6bc1-4413-8592-ec1aac63646e","suberror":"basic_action"}
In KustoClient's _execute method, when response.status_code is not equal to 200, the code throws an exception after converting the response to JSON. However, in cases where the response is not convertible to JSON, the original error is masked and a JSONDecodeError is thrown instead.
This should be fixed in such a way that the original error is not masked, so that the user can diagnose their problem.
I have encountered this problem when my application didn't have access to the Kusto cluster. The original response status code was 403.
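A sketch of the suggested fix, using a hypothetical helper rather than the SDK's actual code: attempt to decode the body as JSON, but fall back to the raw text so a non-JSON response (such as an HTML 403 page) still surfaces the original HTTP error:

```python
import json


def error_payload(status_code, body_text):
    # Prefer the structured error, but never let a JSONDecodeError
    # mask the original HTTP failure.
    try:
        detail = json.loads(body_text)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        detail = body_text
    return {"status_code": status_code, "detail": detail}


error_payload(403, "<html>Forbidden</html>")
```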
Trying out samples listed @ https://github.com/Azure/azure-kusto-python/blob/master/azure-kusto-data/tests/sample.py
This line is failing,
from azure.kusto.data.helpers import dataframe_from_result_table
error: ImportError: No module named helpers
I am trying this at the pyspark prompt and have installed the python modules.
Thanks
Hi, I am using an internal AAD Federated cluster and access the cluster in Kusto Explorer by being part of a security group. How would I go about adding that type of connection in the script? Using my Azure credentials does not seem to work.
Data Source=https://ar****d.kusto.windows.net:443;
Initial Catalog=NetDefaultDB;
AAD Federated Security=True
I normally use AAD-Federated to log in on Kusto.Explorer. What's my tenant ID in this case? I'm trying to use this code:
# In case you want to authenticate with AAD username and password
username = "<username>"
password = "<password>"
kcsb = KustoConnectionStringBuilder.with_aad_user_password_authentication(cluster, username, password, authority_id)
When attempting to use the streaming ingest client in Databricks, using either a dataframe or a stream, I am hitting "ValueError: Timeout value connect was 0:04:30, but it must be an int, float or None.", which I believe is being thrown by requests:
/databricks/python/lib/python3.5/site-packages/azure/kusto/ingest/_streaming_ingest_client.py in ingest_from_dataframe(self, df, ingestion_properties)
51
52 ingestion_properties.format = DataFormat.csv
---> 53 self._ingest(fd.zipped_stream, fd.size, ingestion_properties, content_encoding="gzip")
54
55 fd.delete_files()
Looking through the code, I don't see an obvious way to set this value; can you point me to some documentation?
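The message suggests a datetime.timedelta (0:04:30) is reaching requests, which accepts only int, float, or None for timeouts. Where a timeout value is under your control, converting to seconds sidesteps the error (illustrative values, not the SDK's internals):

```python
from datetime import timedelta

timeout = timedelta(minutes=4, seconds=30)  # what "0:04:30" represents
timeout_seconds = timeout.total_seconds()   # 270.0, acceptable to requests
```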
The __str__ function in the KustoResultTable class cannot handle datetime objects.
Proposed solution: pass a default serializer in the call to json.dumps. This would also eliminate the need for d["kind"] = d["kind"].value:
def __str__(self):
    d = self.to_dict()
    return json.dumps(d, default=str)
Error trace:
Traceback (most recent call last):
File "util\kusto.py", line 126, in <module>
cpu = kusto_client.get_compute_cpu_utilization(start_time, end_time, tenant)
File "util\kusto.py", line 75, in get_compute_cpu_utilization
results = self._poll_and_execute(query, end_time)
File "util\kusto.py", line 51, in _poll_and_execute
logging.info("Polling kusto result {} min: {}".format(i, result))
File "C:\Users\shregup\AppData\Local\Programs\Python\Python37\lib\site-packages\azure\kusto\data\_models.py", line 170, in __str__
return json.dumps(d)
File "C:\Users\shregup\AppData\Local\Programs\Python\Python37\lib\json\__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
File "C:\Users\shregup\AppData\Local\Programs\Python\Python37\lib\json\encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "C:\Users\shregup\AppData\Local\Programs\Python\Python37\lib\json\encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "C:\Users\shregup\AppData\Local\Programs\Python\Python37\lib\json\encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type datetime is not JSON serializable
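The proposed default=str approach can be verified with the standard library alone (illustrative values): str() renders the datetime, so json.dumps no longer raises, and enum-like values would likewise be stringified without the d["kind"] = d["kind"].value workaround.

```python
import json
from datetime import datetime

d = {"kind": "PrimaryResult", "ts": datetime(2019, 10, 2, 14, 50, 43)}
s = json.dumps(d, default=str)  # non-serializable values fall back to str()
# s == '{"kind": "PrimaryResult", "ts": "2019-10-02 14:50:43"}'
```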
def create_df(data):
    print('Constructing df: records = ' + str(len(data)))
    dt = []
    for idx, val in enumerate(data, 0):
        dt.append([idx, json.dumps(val)])
    fields = ['id', 'doc']
    df = pd.DataFrame(data=dt, columns=fields)
    return df

def send_data(data, db_name, table_name):
    df = create_df(data)
    df.to_csv(table_name + ".csv", index=False, encoding="utf-8", header=False)
    ingestion_props = IngestionProperties(
        database=db_name,
        table=table_name,
        dataFormat=DataFormat.csv,
        # in case status updates for success are also required
        # reportLevel=ReportLevel.FailuresAndSuccesses,
    )
    failed = True  # assume it will fail
    print("------------------------------------------------------------------------")
    print('Sending to kusto...')
    while failed:
        try:
            ingestclient.ingest_from_dataframe(df, ingestion_properties=ingestion_props)
            failed = False
        except:
            e = traceback.format_exc()
            print(e)
            failed = True
            time.sleep(10)
    print('Done sending')
    return
r = QueryCosmosDbNoonsForClient(clientName)
send_data(r, dbname, 'noons')
r = GetAllPassagesFromPassageCache(clientName)
send_data(r, dbname, 'passages')
r = GetAlerts(clientName)
send_data(r, dbname, 'alerts')
r = GetVesselsWithModules(clientName)
send_data(r, dbname, 'vesselswithmodules')
r = GetClientList()
send_data(r, dbname, 'clients')
r = GetAllVesselGroupsForAllUsers(clientName)
send_data(r, dbname, 'vesselgroupsforusers')
r = GetAllVesselsForAClient(clientName)
send_data(r, dbname, 'vessels')
r in each case is just a list of JSON strings.
I get random errors when calling ingestclient.ingest_from_dataframe().
The example shows multiple calls to the function above; send_data traps the exception and simply retries, and the retries usually succeed on the first or second attempt. Here is example output:
Constructing df: records = 4610
------------------------------------------------------------------------
Sending to kusto...
Done sending
Constructing df: records = 16
------------------------------------------------------------------------
Sending to kusto...
Done sending
Constructing df: records = 287
------------------------------------------------------------------------
Sending to kusto...
------------------------------------------------------------------------
Sending to kusto...
ERROR:azure.storage.common.storageclient:Client-Request-ID=025eb4ca-e524-11e9-8b7e-70886b83b7da Retry policy did not allow for a retry: Server-Timestamp=Wed, 02 Oct 2019 14:50:42 GMT, Server-Request-ID=887414da-8003-000b-2830-7984e0000000, HTTP status code=400, Exception=The value for one of the HTTP headers is not in the correct format.<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.RequestId:887414da-8003-000b-2830-7984e0000000Time:2019-10-02T14:50:43.4555251Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2019-02-02</HeaderValue></Error>.
Traceback (most recent call last):
File "c:\dev\src\i4\Services\i4ServicesPyParsing\KustoCache\__init__.py", line 108, in send_data
ingestclient.ingest_from_dataframe(df, ingestion_properties=ingestion_props)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\kusto\ingest\_ingest_client.py", line 66, in ingest_from_dataframe
self.ingest_from_blob(BlobDescriptor(url, fd.size), ingestion_properties=ingestion_properties)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\kusto\ingest\_ingest_client.py", line 121, in ingest_from_blob
queue_service.put_message(queue_name=queue_details.object_name, content=encoded)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\queue\queueservice.py", line 793, in put_message
None, None, content])
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 430, in _perform_request
raise ex
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 358, in _perform_request
raise ex
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 344, in _perform_request
HTTPError(response.status, response.message, response.headers, response.body))
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\_error.py", line 115, in _http_error_handler
raise ex
azure.common.AzureHttpError: The value for one of the HTTP headers is not in the correct format.
<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.
RequestId:887414da-8003-000b-2830-7984e0000000
Time:2019-10-02T14:50:43.4555251Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2019-02-02</HeaderValue></Error>
Done sending
Constructing df: records = 4
------------------------------------------------------------------------
Sending to kusto...
Done sending
Constructing df: records = 20
------------------------------------------------------------------------
Sending to kusto...
Done sending
Constructing df: records = 16
------------------------------------------------------------------------
Sending to kusto...
Done sending
Constructing df: records = 2
------------------------------------------------------------------------
Sending to kusto...
ERROR:azure.storage.common.storageclient:Client-Request-ID=0d71a462-e524-11e9-8a6a-70886b83b7da Retry policy did not allow for a retry: Server-Timestamp=Wed, 02 Oct 2019 14:51:01 GMT, Server-Request-ID=467a7e1b-2003-0060-4b30-790314000000, HTTP status code=400, Exception=The value for one of the HTTP headers is not in the correct format.<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.RequestId:467a7e1b-2003-0060-4b30-790314000000Time:2019-10-02T14:51:01.9837789Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2019-02-02</HeaderValue></Error>.
Traceback (most recent call last):
File "c:\dev\src\i4\Services\i4ServicesPyParsing\KustoCache\__init__.py", line 108, in send_data
ingestclient.ingest_from_dataframe(df, ingestion_properties=ingestion_props)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\kusto\ingest\_ingest_client.py", line 66, in ingest_from_dataframe
self.ingest_from_blob(BlobDescriptor(url, fd.size), ingestion_properties=ingestion_properties)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\kusto\ingest\_ingest_client.py", line 121, in ingest_from_blob
queue_service.put_message(queue_name=queue_details.object_name, content=encoded)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\queue\queueservice.py", line 793, in put_message
None, None, content])
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 430, in _perform_request
raise ex
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 358, in _perform_request
raise ex
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 344, in _perform_request
HTTPError(response.status, response.message, response.headers, response.body))
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\_error.py", line 115, in _http_error_handler
raise ex
azure.common.AzureHttpError: The value for one of the HTTP headers is not in the correct format.
<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.
RequestId:467a7e1b-2003-0060-4b30-790314000000
Time:2019-10-02T14:51:01.9837789Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2019-02-02</HeaderValue></Error>
Done sending
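The unbounded while loop with a bare except in send_data can be tightened; a sketch of a bounded retry helper (hypothetical, not part of the SDK):

```python
import time

def with_retries(action, attempts=5, delay=10):
    # Hypothetical helper: retry a callable a bounded number of times, with a
    # simple linear backoff, instead of looping forever on a bare except.
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as e:  # ideally catch AzureHttpError specifically
            last_error = e
            time.sleep(delay * attempt)
    raise last_error
```

Catching a narrow exception type also avoids retrying on client-side bugs, such as a serialization error, which no amount of retrying will fix.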
Output of pip freeze:
adal==1.2.1
antlr4-python3-runtime==4.7.2
applicationinsights==0.11.7
argcomplete==1.10.0
asn1crypto==0.24.0
astroid==2.1.0
atomicwrites==1.3.0
attrs==19.1.0
azure-batch==6.0.0
azure-cli==2.0.67
azure-cli-acr==2.2.9
azure-cli-acs==2.4.4
azure-cli-advisor==2.0.1
azure-cli-ams==0.4.7
azure-cli-appservice==0.2.21
azure-cli-backup==1.2.5
azure-cli-batch==4.0.3
azure-cli-batchai==0.4.10
azure-cli-billing==0.2.2
azure-cli-botservice==0.2.2
azure-cli-cdn==0.2.4
azure-cli-cloud==2.1.1
azure-cli-cognitiveservices==0.2.6
azure-cli-command-modules-nspkg==2.0.2
azure-cli-configure==2.0.24
azure-cli-consumption==0.4.4
azure-cli-container==0.3.18
azure-cli-core==2.0.67
azure-cli-cosmosdb==0.2.11
azure-cli-deploymentmanager==0.1.1
azure-cli-dla==0.2.6
azure-cli-dls==0.1.10
azure-cli-dms==0.1.4
azure-cli-eventgrid==0.2.4
azure-cli-eventhubs==0.3.7
azure-cli-extension==0.2.5
azure-cli-feedback==2.2.1
azure-cli-find==0.3.4
azure-cli-hdinsight==0.3.5
azure-cli-interactive==0.4.5
azure-cli-iot==0.3.11
azure-cli-iotcentral==0.1.7
azure-cli-keyvault==2.2.16
azure-cli-kusto==0.2.3
azure-cli-lab==0.1.8
azure-cli-maps==0.3.5
azure-cli-monitor==0.2.15
azure-cli-natgateway==0.1.1
azure-cli-network==2.5.2
azure-cli-nspkg==3.0.3
azure-cli-policyinsights==0.1.4
azure-cli-privatedns==1.0.2
azure-cli-profile==2.1.5
azure-cli-rdbms==0.3.12
azure-cli-redis==0.4.4
azure-cli-relay==0.1.5
azure-cli-reservations==0.4.3
azure-cli-resource==2.1.16
azure-cli-role==2.6.4
azure-cli-search==0.1.2
azure-cli-security==0.1.2
azure-cli-servicebus==0.3.6
azure-cli-servicefabric==0.1.20
azure-cli-signalr==1.0.1
azure-cli-sql==2.2.5
azure-cli-sqlvm==0.2.0
azure-cli-storage==2.4.3
azure-cli-telemetry==1.0.2
azure-cli-vm==2.2.23
azure-common==1.1.23
azure-cosmos==3.0.2
azure-datalake-store==0.0.39
azure-functions==1.0.0b5
azure-functions-devops-build==0.0.22
azure-functions-worker==1.0.0b10
azure-graphrbac==0.60.0
azure-keyvault==1.1.0
azure-kusto-data==0.0.33
azure-kusto-ingest==0.0.33
azure-mgmt-advisor==2.0.1
azure-mgmt-applicationinsights==0.1.1
azure-mgmt-authorization==0.50.0
azure-mgmt-batch==6.0.0
azure-mgmt-batchai==2.0.0
azure-mgmt-billing==0.2.0
azure-mgmt-botservice==0.2.0
azure-mgmt-cdn==3.1.0
azure-mgmt-cognitiveservices==3.0.0
azure-mgmt-compute==5.0.0
azure-mgmt-consumption==2.0.0
azure-mgmt-containerinstance==1.4.0
azure-mgmt-containerregistry==2.8.0
azure-mgmt-containerservice==5.2.0
azure-mgmt-cosmosdb==0.6.1
azure-mgmt-datalake-analytics==0.2.1
azure-mgmt-datalake-nspkg==3.0.1
azure-mgmt-datalake-store==0.5.0
azure-mgmt-datamigration==0.1.0
azure-mgmt-deploymentmanager==0.1.0
azure-mgmt-devtestlabs==2.2.0
azure-mgmt-dns==2.1.0
azure-mgmt-eventgrid==2.2.0
azure-mgmt-eventhub==2.6.0
azure-mgmt-hdinsight==0.2.1
azure-mgmt-imagebuilder==0.2.1
azure-mgmt-iotcentral==1.0.0
azure-mgmt-iothub==0.8.2
azure-mgmt-iothubprovisioningservices==0.2.0
azure-mgmt-keyvault==1.1.0
azure-mgmt-kusto==0.3.0
azure-mgmt-loganalytics==0.2.0
azure-mgmt-managementgroups==0.1.0
azure-mgmt-maps==0.1.0
azure-mgmt-marketplaceordering==0.1.0
azure-mgmt-media==1.1.1
azure-mgmt-monitor==0.5.2
azure-mgmt-msi==0.2.0
azure-mgmt-network==3.0.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-policyinsights==0.3.1
azure-mgmt-privatedns==0.1.0
azure-mgmt-rdbms==1.8.0
azure-mgmt-recoveryservices==0.1.1
azure-mgmt-recoveryservicesbackup==0.1.2
azure-mgmt-redis==6.0.0
azure-mgmt-relay==0.1.0
azure-mgmt-reservations==0.3.1
azure-mgmt-resource==2.1.0
azure-mgmt-search==2.0.0
azure-mgmt-security==0.1.0
azure-mgmt-servicebus==0.6.0
azure-mgmt-servicefabric==0.2.0
azure-mgmt-signalr==0.1.1
azure-mgmt-sql==0.12.0
azure-mgmt-sqlvirtualmachine==0.3.0
azure-mgmt-storage==3.3.0
azure-mgmt-trafficmanager==0.51.0
azure-mgmt-web==0.42.0
azure-multiapi-storage==0.2.3
azure-nspkg==3.0.2
azure-storage-blob==1.3.1
azure-storage-common==1.4.2
azure-storage-nspkg==3.1.0
azure-storage-queue==2.1.0
bcrypt==3.1.7
certifi==2018.11.29
cffi==1.12.3
cftime==1.0.3.4
chardet==3.0.4
colorama==0.4.1
cryptography==2.7
DateTimeRange==0.5.5
fabric==2.4.0
geographiclib==1.49
grpcio==1.14.2
grpcio-tools==1.14.2
html2text==2018.1.9
humanfriendly==4.18
idna==2.8
importlib-metadata==0.20
invoke==1.2.0
ipaddress==1.0.22
isodate==0.6.0
isort==4.3.4
Jinja2==2.10.1
jmespath==0.9.4
knack==0.6.2
lazy-object-proxy==1.3.1
mail-parser==3.8.1
MarkupSafe==1.1.1
mbstrdecoder==0.7.0
mccabe==0.6.1
mock==3.0.5
more-itertools==7.2.0
msrest==0.6.8
msrestazure==0.6.1
netCDF4==1.4.2
numpy==1.16.0
oauthlib==3.0.1
packaging==19.1
pandas==0.24.0
paramiko==2.6.0
pluggy==0.12.0
portalocker==1.2.1
prompt-toolkit==1.0.16
protobuf==3.6.1
psutil==5.6.3
ptvsd==4.2.2
py==1.8.0
pycparser==2.19
Pygments==2.4.2
PyJWT==1.7.1
pylint==2.2.2
PyNaCl==1.3.0
pyOpenSSL==19.0.0
pyparsing==2.4.2
pyperclip==1.7.0
pypiwin32==223
pyreadline==2.1
pytest==5.1.2
python-dateutil==2.8.0
pytz==2018.9
pywin32==224
PyYAML==5.1.1
requests==2.21.0
requests-oauthlib==1.2.0
scipy==1.2.0
scp==0.13.2
simplejson==3.16.0
six==1.12.0
sshtunnel==0.1.5
tabulate==0.8.3
typed-ast==1.2.0
typepy==0.4.0
urllib3==1.24.1
vsts==0.1.25
vsts-cd-manager==1.0.2
wcwidth==0.1.7
websocket-client==0.56.0
wrapt==1.11.1
xarray==0.11.3
xlrd==1.2.0
xmltodict==0.12.0
zipp==0.6.0
I tried the Kusto ingest sample code in the README and it says DataFormat is not found....
It would be great if the SDK supported retrieving auth tokens from the Azure managed identity endpoint when running on Azure resources.
Hi,
Using the following simple code:
KCSB_INGEST = KustoConnectionStringBuilder.with_aad_user_password_authentication(cluster, <u>, <p>)
ingestclient = KustoIngestClient(KCSB_INGEST)
dt = []
for idx, val in enumerate(r, 0):
    dt.append([idx, json.dumps(val)])
fields = ['id', 'doc']
df = pd.DataFrame(data=dt, columns=fields)
ingestion_props = IngestionProperties(
    database=db_name,
    table=table_name,
    dataFormat=DataFormat.csv,
)
ingestclient.ingest_from_dataframe(df, ingestion_properties=ingestion_props)
r is just a list of strings.
I always get the following error:
ERROR:azure.storage.common.storageclient:Client-Request-ID=6cfd5cac-d9d2-11e9-a960-70886b83b7da Retry policy did not allow for a retry: Server-Timestamp=Wed, 18 Sep 2019 05:09:00 GMT, Server-Request-ID=7383a49b-e003-0040-7fdf-6d78b3000000, HTTP status code=400, Exception=The value for one of the HTTP headers is not in the correct format.<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.RequestId:7383a49b-e003-0040-7fdf-6d78b3000000Time:2019-09-18T05:09:00.6931679Z</Message><HeaderName>x-ms-version</HeaderName><HeaderValue>2019-02-02</HeaderValue></Error>.
Traceback (most recent call last):
File "c:\dev\src\i4\Services\i4ServicesPyParsing\KustoCache\__init__.py", line 69, in ingestnoons
ingestclient.ingest_from_dataframe(df, ingestion_properties=ingestion_props)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\kusto\ingest\_ingest_client.py", line 66, in ingest_from_dataframe
self.ingest_from_blob(BlobDescriptor(url, fd.size), ingestion_properties=ingestion_properties)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\kusto\ingest\_ingest_client.py", line 121, in ingest_from_blob
queue_service.put_message(queue_name=queue_details.object_name, content=encoded)
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\queue\queueservice.py", line 793, in put_message
None, None, content])
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 430, in _perform_request
raise ex
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 358, in _perform_request
raise ex
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\storageclient.py", line 344, in _perform_request
HTTPError(response.status, response.message, response.headers, response.body))
File "c:\dev\src\i4\Services\i4ServicesPyParsing\.env\lib\site-packages\azure\storage\common\_error.py", line 115, in _http_error_handler
I used the following to write the dataframe out so I could inspect the CSV, and it looks OK to me:
df.to_csv('foobar', index=False, encoding="utf-8", header=False, compression="gzip")
This seems like a permissions issue, or like something is missing from the setup, such as a blob storage account.
What is missing, or why does this error occur?
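The x-ms-version 2019-02-02 in the error suggests a mismatch between the installed azure-storage-* packages and what the ingestion service's pre-signed storage resources accept. A commonly reported workaround at the time (the exact version pin is an assumption; check azure-kusto-ingest's declared dependency ranges) was to pin the storage package back:

```shell
# Pin azure-storage-common to a release whose default x-ms-version the
# pre-signed Kusto ingestion resources still accept (version is illustrative):
pip install "azure-storage-common==1.4.0"
```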
Currently, KustoIngestClient is imported from the top-level package, while KustoClient isn't:
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
from azure.kusto.ingest import KustoIngestClient
We should consider a design that is consistent between the two packages.
Running the ".show diagnostics" command sometimes returns a value of "1-01-01 00:00:00" for the DataWarmingLastRunOn column. This value throws an exception when it is being converted to a python object:
Traceback (most recent call last):
File "/XXX/kusto_monitor.py", line 192, in <module>
main()
File "/XXX/kusto_monitor.py", line 186, in main
process_cluster(cluster, args, output_file)
File "/XXX/kusto_monitor.py", line 141, in process_cluster
log_command_results(client, cluster_name, NO_DATABASE, ".show diagnostics", "diagnostics", output_file)
File "/XXX/kusto_monitor.py", line 93, in log_command_results
query_result = client.execute(database_name, command)
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/request.py", line 395, in execute
return self.execute_mgmt(database, query, properties)
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/request.py", line 418, in execute_mgmt
return self._execute(self._mgmt_endpoint, database, query, None, KustoClient._mgmt_default_timeout, properties)
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/request.py", line 463, in _execute
return KustoResponseDataSetV1(response.json())
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/_response.py", line 110, in __init__
super(KustoResponseDataSetV1, self).__init__(json_response["Tables"])
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/_response.py", line 18, in __init__
self.tables = [KustoResultTable(t) for t in json_response]
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/_models.py", line 137, in __init__
self.rows = [KustoResultRow(self.columns, row) for row in json_table["Rows"]]
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/_models.py", line 67, in __init__
self._hidden_values.append(to_pandas_datetime(value))
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/azure/kusto/data/helpers.py", line 8, in to_pandas_datetime
return pd.to_datetime(raw_value)
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/pandas/core/tools/datetimes.py", line 469, in to_datetime
result = _convert_listlike(np.array([arg]), box, format)[0]
File "/XXX/venv_healthcheck/local/lib/python2.7/site-packages/pandas/core/tools/datetimes.py", line 380, in _convert_listlike
raise e
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00
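Until the SDK handles the sentinel, the failure can be reproduced and guarded against client-side; a minimal sketch (the zero-padded literal below stands in for the "1-01-01 00:00:00" value from the issue):

```python
import pandas as pd

# pandas Timestamps use 64-bit nanoseconds, so anything before pd.Timestamp.min
# (around year 1677) cannot be represented; the "year 1" sentinel Kusto emits
# for "never ran" therefore raises OutOfBoundsDatetime. errors="coerce" maps
# such values to NaT instead of raising.
value = pd.to_datetime("0001-01-01 00:00:00", errors="coerce")
print(value)  # NaT
```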
Under the "Contribute" title in README.md, there is a hyperlink to a Conributing.md file.
This file is missing.
When running the following, I get a cryptic "TypeError: string indices must be integers" error. Either this is a bug on the Kusto side, or Kusto should provide a better error message.
Note: the code below works if I change execute_query to execute or execute_mgmt.
from azure.kusto.data.request import KustoClient, KustoConnectionStringBuilder, ClientRequestProperties
from azure.kusto.data.exceptions import KustoServiceError
from azure.kusto.data.helpers import dataframe_from_result_table
cluster = "https://help.kusto.windows.net"
# In case you want to authenticate with AAD device code.
# Please note that if you choose this option, you'll need to authenticate for every new instance that is initialized.
# It is highly recommended to create one instance and use it for all of your queries.
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(cluster)
# The authentication method will be taken from the chosen KustoConnectionStringBuilder.
client = KustoClient(kcsb)
response = client.execute_query('Samples', '.show schema')
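For context, execute() dispatches on the leading dot (as the traceback in an earlier issue shows, queries starting with "." are routed to execute_mgmt), which is why '.show schema' fails when sent through execute_query. A sketch of that routing rule, with a hypothetical helper name:

```python
def pick_endpoint(text):
    # Hypothetical helper mirroring KustoClient.execute's routing rule:
    # control commands start with ".", and must hit the management endpoint.
    return "execute_mgmt" if text.lstrip().startswith(".") else "execute_query"

print(pick_endpoint(".show schema"))           # execute_mgmt
print(pick_endpoint("StormEvents | take 10"))  # execute_query
```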
The internal docs for querying Kusto using Azure Notebooks appear to refer to a different API than is present in your README.md.
For example, the internal docs show this import:
from kusto_client import KustoClient
Whereas the README shows this:
from azure.kusto.data import KustoClient
Nowadays, I need to get data from one cluster while also using another cluster's function, but I can only use AAD to access one cluster at a time. How can I do this?
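One option that may help: KQL itself supports cross-cluster references via the cluster() function, so a single AAD-authenticated KustoClient on one cluster can call a function defined on another, as long as the principal has access to both. A sketch (cluster, database, and function names are placeholders):

```python
# Cross-cluster KQL: reference the second cluster inline in the query text.
# All identifiers below are placeholders.
query = """
cluster('othercluster.kusto.windows.net').database('OtherDb').MyFunction()
| take 10
"""
# client.execute("MyDb", query)  # 'client' is a KustoClient on the first cluster
print("cross-cluster" if "cluster(" in query else "single-cluster")
```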
Hi, I'm trying to import Kusto Client using Azure notebooks but I'm facing some issues.
First, I typed "!pip install azure-kusto-data" and it resulted:
Requirement already satisfied: azure-kusto-data in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (0.0.11)
Requirement already satisfied: adal>=1.0.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from azure-kusto-data) (1.2.0)
Requirement already satisfied: six>=1.10.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from azure-kusto-data) (1.11.0)
Requirement already satisfied: requests>=2.13.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from azure-kusto-data) (2.20.1)
Requirement already satisfied: pandas>=0.15.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from azure-kusto-data) (0.22.0)
Requirement already satisfied: azure-nspkg>=2.0.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from azure-kusto-data) (3.0.2)
Requirement already satisfied: python-dateutil>=2.6.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from azure-kusto-data) (2.7.5)
Requirement already satisfied: cryptography>=1.1.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from adal>=1.0.0->azure-kusto-data) (2.3.1)
Requirement already satisfied: PyJWT>=1.0.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from adal>=1.0.0->azure-kusto-data) (1.7.1)
Requirement already satisfied: certifi>=2017.4.17 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from requests>=2.13.0->azure-kusto-data) (2018.10.15)
Requirement already satisfied: idna<2.8,>=2.5 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from requests>=2.13.0->azure-kusto-data) (2.7)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from requests>=2.13.0->azure-kusto-data) (1.23)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from requests>=2.13.0->azure-kusto-data) (3.0.4)
Requirement already satisfied: pytz>=2011k in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from pandas>=0.15.0->azure-kusto-data) (2018.7)
Requirement already satisfied: numpy>=1.9.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from pandas>=0.15.0->azure-kusto-data) (1.14.6)
Requirement already satisfied: asn1crypto>=0.21.0 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from cryptography>=1.1.0->adal>=1.0.0->azure-kusto-data) (0.24.0)
Requirement already satisfied: cffi!=1.11.3,>=1.7 in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from cryptography>=1.1.0->adal>=1.0.0->azure-kusto-data) (1.11.5)
Requirement already satisfied: pycparser in /home/nbuser/anaconda3_501/lib/python3.6/site-packages (from cffi!=1.11.3,>=1.7->cryptography>=1.1.0->adal>=1.0.0->azure-kusto-data) (2.19)
Later, I typed "from azure.kusto.data import KustoClient", but got the result:
"ImportError: cannot import name 'KustoClient'"
Have you seen this error before or have a clue about the reason?
Thank you,
Natalia
For the azure-kusto-data SDK, there is nowhere to set the login endpoint for AAD authentication. ADX has launched in Azure China, but it cannot be connected to with the Python SDK yet.
After investing work in #124
and some internal discussions, we agreed to put this PR on hold and reconsider changing the API, to give better performance for both vanilla Python and pandas use cases and to avoid some difficult trickery needed to parse Kusto types into a dataframe.
The final API would look like:
# result is of type KustoResultDataSet
result = client.execute(db, query)
# raw json
result.tables[0].json()
# iterator with lazy parsing of json
result.tables[0].rows()
# dataframe parsing from raw json
result.tables[0].to_dataframe()
This will cause some memory pressure, so a best practice would probably be:
# either explicitly access a specific table and drop the reference after conversion
df = client.execute(db, query).primary_results[0].to_dataframe()
# or, parse it all
dfs = client.execute(db, query).to_dataframes()
Feel free to add your thoughts; the code will be implemented in the next couple of weeks.