lodstorage package¶
Submodules¶
lodstorage.csv module¶
-
class
lodstorage.csv.CSV(name)[source]¶ Bases:
lodstorage.lod.LOD
helper for converting data in csv format to list of dicts (LoD) and vice versa
Constructor
-
static
fromCSV(csvString: str, fields: list = None, delimiter=',', quoting=2, **kwargs)[source]¶ convert given csv string to list of dicts (LoD)
Parameters: - csvString (str) – csv string that should be converted to LoD
- fields (list) – names of the header fields that should be used. If None, the header is assumed to be given in the csv string.
Returns: list of dicts (LoD) containing the content of the given csv string
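The conversion described above can be illustrated with the standard library alone. The following is a minimal sketch, not the implementation of CSV.fromCSV: the function name csv_to_lod and the sample data are hypothetical, and it assumes a single-character delimiter as required by Python's csv module.

```python
import csv
import io


def csv_to_lod(csv_string, fields=None, delimiter=",", quoting=csv.QUOTE_NONNUMERIC):
    """Parse a CSV string into a list of dicts (LoD).

    Hypothetical helper: if fields is None, the first row is used as header.
    """
    reader = csv.DictReader(
        io.StringIO(csv_string),
        fieldnames=fields,
        delimiter=delimiter,
        quoting=quoting,
    )
    return [dict(row) for row in reader]


# made-up sample data
csv_text = '"name","age"\n"Alice",30\n"Bob",25\n'
lod = csv_to_lod(csv_text)
print(lod)
```

With quoting=2 (csv.QUOTE_NONNUMERIC) the reader converts unquoted fields to float, so the ages in this sketch come back as 30.0 and 25.0.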
-
static
readFile(filename: str) → str[source]¶ Reads the given filename and returns its content as string
Parameters: filename (str) – Name of the file that should be returned as string
Returns: Content of the file as string
-
static
restoreFromCSVFile(filePath: str, headerNames: list = None, withPostfix: bool = False)[source]¶ restore LOD from given csv file
Parameters: - filePath (str) – file name
- headerNames (list) – Names of the headers that should be used. If None it is assumed that the header is given.
- withPostfix (bool) – If False the file type is appended to given filePath. Otherwise file type MUST be given with filePath.
Returns: list of dicts (LoD) containing the content of the given csv file
-
static
storeToCSVFile(lod: list, filePath: str, withPostfix: bool = False)[source]¶ converts the given lod to CSV file.
Parameters: - lod (list) – lod that should be converted to csv file
- filePath (str) – file name the csv should be stored to
- withPostfix (bool) – If False the file type is appended to given filePath. Otherwise file type MUST be given with filePath.
Returns: csv string of the given lod
-
static
toCSV(lod: list, includeFields: list = None, excludeFields: list = None, delimiter=',', quoting=2, **kwargs)[source]¶ converts the given lod to CSV string. For details about the csv dialect parameters see https://docs.python.org/3/library/csv.html#csv-fmt-params
Parameters: - lod (list) – lod that should be converted to csv string
- includeFields (list) – list of fields that should be included in the csv (positive list)
- excludeFields (list) – list of fields that should be excluded from the csv (negative list)
- kwargs – csv dialect parameters
Returns: csv string of the given lod
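A positive list and a negative list of fields can be sketched as follows. This is an illustration using only the standard csv module, not the code behind CSV.toCSV; the name lod_to_csv and the sample records are hypothetical.

```python
import csv
import io


def lod_to_csv(lod, include_fields=None, exclude_fields=None, **dialect):
    """Convert a list of dicts to a CSV string (hypothetical helper)."""
    # collect the field names in first-seen order over all records
    fields = []
    for record in lod:
        for key in record:
            if key not in fields:
                fields.append(key)
    if include_fields is not None:
        fields = [f for f in fields if f in include_fields]
    if exclude_fields is not None:
        fields = [f for f in fields if f not in exclude_fields]
    buffer = io.StringIO()
    # extrasaction="ignore" silently drops the fields that were filtered out
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore", **dialect)
    writer.writeheader()
    writer.writerows(lod)
    return buffer.getvalue()


# made-up sample data
people = [
    {"name": "Alice", "age": 30, "note": "x"},
    {"name": "Bob", "age": 25, "note": "y"},
]
csv_text = lod_to_csv(people, exclude_fields=["note"], quoting=csv.QUOTE_NONNUMERIC)
print(csv_text)
```

Any extra keyword arguments are passed through as csv dialect parameters, mirroring the kwargs parameter described above.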
lodstorage.entity module¶
Created on 2020-08-19
@author: wf
-
class
lodstorage.entity.EntityManager(name, entityName, entityPluralName: str, listName: str = None, clazz=None, tableName: str = None, primaryKey: str = None, config=None, handleInvalidListTypes=False, filterInvalidListTypes=False, listSeparator='⇹', debug=False)[source]¶ Bases:
lodstorage.yamlablemixin.YamlAbleMixin, lodstorage.jsonpicklemixin.JsonPickleMixin, lodstorage.jsonable.JSONAbleList
generic entity manager
Constructor
Parameters: - name (string) – name of this eventManager
- entityName (string) – entityType to be managed e.g. Country
- entityPluralName (string) – plural of the entityType e.g. Countries
- config (StorageConfig) – the configuration to be used; if None, a default configuration will be used
- handleInvalidListTypes (bool) – True if invalidListTypes should be converted or filtered
- filterInvalidListTypes (bool) – True if invalidListTypes should be deleted
- listSeparator (str) – the symbol to use as a list separator
- debug (boolean) – override debug setting when default of config is used via config=None
-
fromCache(force: bool = False, getListOfDicts=None, append=False, sampleRecordCount=-1)[source]¶ get my entries from the cache or from the callback provided
Parameters: - force (bool) – force ignoring the cache
- getListOfDicts (callable) – a function to call for getting the data
- append (bool) – True if records should be appended
- sampleRecordCount (int) – the number of records to analyze for type information
Returns: the list of Dicts and as a side effect setting self.cacheFile
-
fromStore(cacheFile=None, setList: bool = True) → list[source]¶ restore me from the store
Parameters: - cacheFile (string) – the cacheFile to use; if None use the preconfigured cacheFile
- setList (bool) – if True set my list with the data from the cache file
Returns: list of dicts or JSON entitymanager Return type: list
-
getCacheFile(config=None, mode=<StoreMode.SQL: 3>)[source]¶ get the cache file for this event manager
Parameters: - config (StorageConfig) – if None get the cache for my mode
- mode (StoreMode) – the storeMode to use
-
getLoD()[source]¶ Return the LoD of the entities in the list
Returns: a list of Dicts Return type: list
-
getSQLDB(cacheFile)[source]¶ get the SQL database for the given cacheFile
Parameters: cacheFile (string) – the file to get the SQL db from
-
initSQLDB(sqldb, listOfDicts=None, withCreate: bool = True, withDrop: bool = True, sampleRecordCount=-1)[source]¶ initialize my sql DB
Parameters: - listOfDicts (list) – the list of dicts to analyze for type information
- withDrop (boolean) – true if the existing Table should be dropped
- withCreate (boolean) – true if the create Table command should be executed - false if only the entityInfo should be returned
- sampleRecordCount (int) – the number of records to analyze for type information
Returns: the entity information such as CREATE Table command
Return type:
-
setNone(record, fields)[source]¶ make sure the given fields in the given record are set to None
Parameters: - record (dict) – the record to work on
- fields (list) – the list of fields to set to None
-
showProgress(msg)[source]¶ display a progress message
Parameters: msg (string) – the message to display
-
store(limit=10000000, batchSize=250, append=False, fixNone=True, sampleRecordCount=-1, replace: bool = False) → str[source]¶ store my list of dicts
Parameters: - limit (int) – maximum number of records to store per batch
- batchSize (int) – size of batch for storing
- append (bool) – True if records should be appended
- fixNone (bool) – if True make sure the dicts are filled with None references for each record
- sampleRecordCount (int) – the number of records to analyze for type information
- replace (bool) – if True allow replace for insert
Returns: The cachefile being used
Return type: str
-
storeLoD(listOfDicts, limit=10000000, batchSize=250, cacheFile=None, append=False, fixNone=True, sampleRecordCount=1, replace: bool = False) → str[source]¶ store my entities
Parameters: - listOfDicts (list) – the list of dicts to store
- limit (int) – maximum number of records to store
- batchSize (int) – size of batch for storing
- cacheFile (string) – the name of the storage e.g path to JSON or sqlite3 file
- append (bool) – True if records should be appended
- fixNone (bool) – if True make sure the dicts are filled with None references for each record
- sampleRecordCount (int) – the number of records to analyze for type information
- replace (bool) – if True allow replace for insert
Returns: The cachefile being used
Return type: str
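How limit, batchSize and fixNone can interact when a list of dicts is written to an SQL store may be easier to see in a small example. The sketch below uses the standard sqlite3 module and is only an assumption about the general technique; the function store_lod, the table name and the untyped columns are hypothetical and do not reproduce EntityManager.storeLoD.

```python
import sqlite3


def store_lod(connection, table, lod, batch_size=250, limit=10_000_000):
    """Insert a list of dicts into an SQLite table in batches (hypothetical helper)."""
    records = lod[:limit]
    if not records:
        return 0
    # union of all keys in first-seen order, so sparse records are supported
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')
    insert_sql = f'INSERT INTO "{table}" ({column_sql}) VALUES ({placeholders})'
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        # missing keys become None, comparable to the fixNone idea above
        rows = [tuple(record.get(c) for c in columns) for record in batch]
        connection.executemany(insert_sql, rows)
        connection.commit()
    return len(records)


conn = sqlite3.connect(":memory:")
sample = [{"id": i, "name": f"n{i}"} for i in range(5)]
sample[3] = {"id": 3}  # record without a name
stored = store_lod(conn, "sample", sample, batch_size=2)
print(stored)
```

Committing once per batch keeps each transaction small; a real implementation would also derive column types from sample records, which this sketch skips.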
lodstorage.jsonable module¶
This module has a class JSONAble for serialization of tables/list of dicts to and from JSON encoding
Created on 2020-09-03
@author: wf
-
class
lodstorage.jsonable.JSONAble[source]¶ Bases:
object
mixin to allow classes to be JSON serializable see
Constructor
-
asJSON(asString=True, data=None)[source]¶ recursively return my dict elements
Parameters: asString (boolean) – if True return my result as a string
-
checkExtension(jsonFile: str, extension: str = '.json') → str[source]¶ make sure the jsonFile has the given extension e.g. “.json”
Parameters: jsonFile (str) – the jsonFile name - potentially without “.json” suffix
Returns: the jsonFile name with “.json” as an extension guaranteed
Return type: str
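The extension check amounts to a suffix test. A minimal sketch, with the hypothetical name check_extension, assuming the behaviour is exactly what the description above states:

```python
def check_extension(json_file, extension=".json"):
    """Return json_file with the given extension appended if it is missing."""
    if not json_file.endswith(extension):
        json_file = f"{json_file}{extension}"
    return json_file


print(check_extension("royals"))
print(check_extension("royals.json"))
```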
-
fromDict(data: dict)[source]¶ initialize me from the given data
Parameters: data (dict) – the dictionary to initialize me from
-
fromJson(jsonStr)[source]¶ initialize me from the given JSON string
Parameters: jsonStr (str) – the JSON string
-
getJSONValue(v)[source]¶ get the value of the given v as JSON
Parameters: v (object) – the value to get
Returns: the value, making sure objects are returned as dicts
-
static
getJsonTypeSamplesForClass(cls)[source]¶ return the type samples for the given class
Returns: a list of dicts that specify the types by example
Return type: list
-
static
readJsonFromFile(jsonFilePath)[source]¶ read json string from the given jsonFilePath
Parameters: jsonFilePath (string) – the path of the file where to read the result from
Returns: the JSON string read from the file
-
reprDict(srcDict)[source]¶ get the given srcDict as new dict with fields being converted with getJSONValue
Parameters: srcDict (dict) – the source dictionary
Returns: the converted dictionary
Return type: dict
-
restoreFromJsonFile(jsonFile: str)[source]¶ restore me from the given jsonFile
Parameters: jsonFile (string) – the jsonFile to restore me from
-
static
singleQuoteToDoubleQuote(singleQuoted, useRegex=False)[source]¶ convert a single quoted string to a double quoted one
Parameters: - singleQuoted (str) – a single quoted string e.g. {‘cities’: [{‘name’: “Upper Hell’s Gate”}]}
- useRegex (boolean) – True if a regular expression shall be used for matching
Returns: the double quoted version of the string
Return type: string
-
static
singleQuoteToDoubleQuoteUsingBracketLoop(singleQuoted)[source]¶ convert a single quoted string to a double quoted one using a bracket loop
Parameters: singleQuoted (string) – a single quoted string e.g. {‘cities’: [{‘name’: “Upper Hell’s Gate”}]}
Returns: the double quoted version of the string e.g.
Return type: string
-
static
singleQuoteToDoubleQuoteUsingRegex(singleQuoted)[source]¶ convert a single quoted string to a double quoted one using a regular expression
Parameters: singleQuoted (string) – a single quoted string e.g. {‘cities’: [{‘name’: “Upper Hell’s Gate”}]}
Returns: the double quoted version of the string e.g.
Return type: string
-
static
storeJsonToFile(jsonStr, jsonFilePath)[source]¶ store the given json string to the given jsonFilePath
Parameters: - jsonStr (string) – the string to store
- jsonFilePath (string) – the path of the file where to store the result
-
storeToJsonFile(jsonFile: str, extension: str = '.json', limitToSampleFields: bool = False)[source]¶ store me to the given jsonFile
Parameters: - jsonFile (str) – the JSON file name (optionally without extension)
- extension (str) – the extension to use if not part of the jsonFile name
- limitToSampleFields (bool) – If True the returned JSON is limited to the attributes/fields that are present in the samples. Otherwise all attributes of the object will be included. Default is False.
-
toJSON(limitToSampleFields: bool = False)[source]¶
Parameters: limitToSampleFields (bool) – If True the returned JSON is limited to the attributes/fields that are present in the samples. Otherwise all attributes of the object will be included. Default is False.
Returns: a recursive JSON dump of the dicts of my objects
-
-
class
lodstorage.jsonable.JSONAbleList(listName: str = None, clazz=None, tableName: str = None, initList: bool = True, handleInvalidListTypes=False, filterInvalidListTypes=False)[source]¶ Bases:
lodstorage.jsonable.JSONAble
Container class
Constructor
Parameters: - listName (str) – the name of the list attribute to be used for storing the List
- clazz (class) – a class to be used for Object relational mapping (if any)
- tableName (str) – the name of the “table” to be used
- initList (bool) – True if the list should be initialized
- handleInvalidListTypes (bool) – True if invalidListTypes should be converted or filtered
- filterInvalidListTypes (bool) – True if invalidListTypes should be deleted
-
asJSON(asString=True)[source]¶ recursively return my dict elements
Parameters: asString (boolean) – if True return my result as a string
-
fromJson(jsonStr, types=None)[source]¶ initialize me from the given JSON string
Parameters: - jsonStr (str) – the JSON string
- types (Types) – the types to be fixed
-
fromLoD(lod, append: bool = True, debug: bool = False)[source]¶ load my entityList from the given list of dicts
Parameters: - lod (list) – the list of dicts to load
- append (bool) – if True append to my existing entries
Returns: a list of errors (if any)
Return type: list
-
getLoDfromJson(jsonStr: str, types=None, listName: str = None)[source]¶ get a list of Dicts from the given JSON String
Parameters: - jsonStr (str) – the JSON string
- types (Types) – the types to be fixed
Returns: a list of dicts
Return type: list
-
getLookup(attrName: str, withDuplicates: bool = False)[source]¶ create a lookup dictionary by the given attribute name
Parameters: - attrName (str) – the attribute to lookup
- withDuplicates (bool) – whether to retain single values or lists
Returns: a dictionary for lookup or a tuple dictionary,list of duplicates depending on withDuplicates
-
readLodFromJsonFile(jsonFile: str, extension: str = '.json')[source]¶ read the list of dicts from the given jsonFile
Parameters: jsonFile (string) – the jsonFile to read from
Returns: a list of dicts
Return type: list
-
readLodFromJsonStr(jsonStr) → list[source]¶ restore me from the given jsonStr
Parameters: jsonStr (str) – the JSON string to read from
-
restoreFromJsonStr(jsonStr: str) → list[source]¶ restore me from the given jsonStr
Parameters: jsonStr (str) – the json string to restore me from
-
class
lodstorage.jsonable.JSONAbleSettings[source]¶ Bases:
object
settings for JSONAble - put in a separate class so they would not be serialized
-
indent= 4¶
-
singleQuoteRegex= re.compile("(?<!\\\\)'")¶ regular expression to be used for conversion from singleQuote to doubleQuote see https://stackoverflow.com/a/50257217/1497139
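The singleQuoteRegex pattern matches every single quote that is not preceded by a backslash. The sketch below shows what a replacement based on that pattern does; the helper single_to_double and the sample string are hypothetical, and the code is an illustration rather than the library's conversion routine.

```python
import json
import re

# same pattern as shown for singleQuoteRegex: a quote not preceded by a backslash
single_quote_regex = re.compile("(?<!\\\\)'")


def single_to_double(single_quoted):
    """Replace unescaped single quotes with double quotes (hypothetical helper)."""
    return single_quote_regex.sub('"', single_quoted)


text = "{'name': 'Elizabeth', 'age': 94}"
converted = single_to_double(text)
data = json.loads(converted)
print(converted)
```

Note that a plain regex replacement also rewrites apostrophes inside values such as Upper Hell's Gate, which is a likely reason for offering a non-regex alternative.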
-
-
class
lodstorage.jsonable.Types(name: str, warnOnUnsupportedTypes=True, debug=False)[source]¶ Bases:
lodstorage.jsonable.JSONAble
holds entity meta Info
Variables: name (string) – entity name = table name
Constructor
Parameters: - name (str) – the name of the type map
- warnOnUnsupportedTypes (bool) – if True warn if an item value has an unsupported type
- debug (bool) – if True - debugging information should be shown
-
addType(listName, field, valueType)[source]¶ add the python type for the given field to the typeMap
Parameters: - listName (string) – the name of the list of the field
- field (string) – the name of the field
- valueType (type) – the python type of the field
-
fixTypes(lod: list, listName: str)[source]¶ fix the types in the given data structure
Parameters: - lod (list) – a list of dicts
- listName (str) – the types to lookup by list name
-
static
forTable(instance, listName: str, warnOnUnsupportedTypes: bool = True, debug=False)[source]¶ get the types for the list of Dicts (table) in the given instance with the given listName
Parameters: - instance (object) – the instance to inspect
- listName (string) – the list of dicts to inspect
- warnOnUnsupportedTypes (bool) – if True warn if an item value has an unsupported type
- debug (bool) – True if debugging information should be shown
Returns: a types object
Return type: Types
-
getTypes(listName: str, sampleRecords: list, limit: int = 10)[source]¶ determine the types for the given sample records
Parameters: - listName (str) – the name of the list
- sampleRecords (list) – a list of items
- limit (int) – the maximum number of items to check
-
getTypesForItems(listName: str, items: list, warnOnNone: bool = False)[source]¶ get the types for the given items; as a side effect my types are set
Parameters: - listName (str) – the name of the list
- items (list) – a list of items
- warnOnNone (bool) – if True warn if an item value is None
-
typeName2Type= {'bool': <class 'bool'>, 'date': <class 'datetime.date'>, 'datetime': <class 'datetime.datetime'>, 'float': <class 'float'>, 'int': <class 'int'>, 'str': <class 'str'>}¶
lodstorage.jsonpicklemixin module¶
-
class
lodstorage.jsonpicklemixin.JsonPickleMixin[source]¶ Bases:
object
allow reading and writing derived objects from a jsonpickle file
-
asJsonPickle() → str[source]¶ convert me to JSON
Returns: a JSON String with my JSON representation
Return type: str
-
static
checkExtension(jsonFile: str, extension: str = '.json') → str[source]¶ make sure the jsonFile has the given extension e.g. “.json”
Parameters: jsonFile (str) – the jsonFile name - potentially without “.json” suffix
Returns: the jsonFile name with “.json” as an extension guaranteed
Return type: str
-
debug= False¶
-
lodstorage.lod module¶
Created on 2021-01-31
@author: wf
-
class
lodstorage.lod.LOD(name)[source]¶ Bases:
object
list of Dict aka Table
Constructor
-
static
addLookup(lookup, duplicates, record, value, withDuplicates: bool)[source]¶ add a single lookup result
Parameters: - lookup (dict) – the lookup map
- duplicates (list) – the list of duplicates
- record (dict) – the current record
- value (object) – the current value to lookup
- withDuplicates (bool) – if True duplicates should be allowed and lists returned; if False a separate duplicates list is created
-
static
filterFields(lod: list, fields: list, reverse: bool = False)[source]¶ filter the given LoD with the given list of fields by either limiting the LoD to the fields or removing the fields contained in the list depending on the state of the reverse parameter
Parameters: - lod (list) – list of dicts from which the fields should be excluded
- fields (list) – list of fields that should be excluded from the lod
- reverse (bool) – If True limit dict to the list of given fields. Otherwise exclude the fields from the dict.
Returns: LoD
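The two modes of the reverse parameter can be sketched with dict comprehensions. This follows the parameter description above and is not taken from the library source; the name filter_fields and the sample data are hypothetical.

```python
def filter_fields(lod, fields, reverse=False):
    """Keep only the given fields (reverse=True) or drop them (reverse=False)."""
    if reverse:
        return [{k: v for k, v in record.items() if k in fields} for record in lod]
    return [{k: v for k, v in record.items() if k not in fields} for record in lod]


# made-up sample data
records = [{"name": "Alice", "age": 30, "note": "x"}]
without_note = filter_fields(records, ["note"])
only_name = filter_fields(records, ["name"], reverse=True)
print(without_note, only_name)
```

The sketch returns new dicts and leaves the input records untouched; whether the library modifies records in place is not stated here.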
-
static
getLookup(lod: list, attrName: str, withDuplicates: bool = False)[source]¶ create a lookup dictionary by the given attribute name for the given list of dicts
Parameters: - lod (list) – the list of dicts to get the lookup dictionary for
- attrName (str) – the attribute to lookup
- withDuplicates (bool) – whether to retain single values or lists
Returns: a dictionary for lookup
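A lookup dictionary keyed by one attribute can be built in a single pass. The sketch below is one plausible reading of the withDuplicates flag and always returns a tuple of lookup and duplicates for simplicity, which is an assumption; get_lookup and the sample data are hypothetical.

```python
def get_lookup(lod, attr_name, with_duplicates=False):
    """Build a lookup dict by attribute value (hypothetical helper).

    with_duplicates=True: each value maps to a list of records.
    with_duplicates=False: each value maps to the first record seen,
    later records with the same value are collected as duplicates.
    """
    lookup = {}
    duplicates = []
    for record in lod:
        if attr_name not in record:
            continue
        value = record[attr_name]
        if with_duplicates:
            lookup.setdefault(value, []).append(record)
        elif value in lookup:
            duplicates.append(record)
        else:
            lookup[value] = record
    return lookup, duplicates


# made-up sample data
cities = [
    {"name": "Springfield", "state": "IL"},
    {"name": "Springfield", "state": "MA"},
    {"name": "Boston", "state": "MA"},
]
single, dups = get_lookup(cities, "name")
multi, _ = get_lookup(cities, "name", with_duplicates=True)
print(len(single), len(dups), len(multi["Springfield"]))
```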
-
classmethod
handleListTypes(lod, doFilter=False, separator=', ')[source]¶ handle list types in the given list of dicts
Parameters: - cls – this class
- lod (list) – a list of dicts
- doFilter (bool) – True if records containing lists value items should be filtered
- separator (str) – the separator to use when converting lists
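List-valued fields usually cannot be stored in a flat table column, so they are either joined into a string or removed. The following sketch shows both options under the assumption that filtering removes the list-valued field rather than the whole record; handle_list_types and the sample data are hypothetical and the library may behave differently.

```python
def handle_list_types(lod, do_filter=False, separator=","):
    """Join or remove list values in the given list of dicts (hypothetical helper)."""
    for record in lod:
        # iterate over a copy of the keys because fields may be deleted
        for key in list(record):
            value = record[key]
            if isinstance(value, list):
                if do_filter:
                    del record[key]
                else:
                    record[key] = separator.join(str(v) for v in value)
    return lod


joined = handle_list_types([{"name": "Alice", "tags": ["a", "b"]}], separator="⇹")
filtered = handle_list_types([{"name": "Alice", "tags": ["a", "b"]}], do_filter=True)
print(joined, filtered)
```

An unusual separator such as ⇹ (the listSeparator default shown for EntityManager) makes it unlikely that the separator collides with characters inside the values.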
-
static
intersect(listOfDict1, listOfDict2, key=None)[source]¶ get the intersection of the two lists of Dicts by the given key
-
static
setNone(record, fields)[source]¶ make sure the given fields in the given record are set to None
Parameters: - record (dict) – the record to work on
- fields (list) – the list of fields to set to None
lodstorage.mwTable module¶
Created on 2020-08-21
@author: wf
-
class
lodstorage.mwTable.MediaWikiTable(wikiTable=True, colFormats=None, sortable=True, withNewLines=False)[source]¶ Bases:
object
helper for https://www.mediawiki.org/wiki/Help:Tables
Constructor
lodstorage.plot module¶
Created on 2020-07-05
@author: wf
-
class
lodstorage.plot.Plot(valueList, title, xlabel=None, ylabel=None, gformat='.png', fontsize=12, plotdir=None, debug=False)[source]¶ Bases:
object
create Plot based on counters, see https://stackoverflow.com/questions/19198920/using-counter-in-python-to-build-histogram
Constructor
lodstorage.query module¶
Created on 2020-08-22
@author: wf
-
class
lodstorage.query.Endpoint[source]¶ Bases:
lodstorage.jsonable.JSONAble
a query endpoint
constructor for setting defaults
-
class
lodstorage.query.EndpointManager[source]¶ Bases:
object
manages a set of SPARQL endpoints
-
class
lodstorage.query.Format[source]¶ Bases:
enum.Enum
the supported formats for the results to be delivered
-
csv= 'csv'¶
-
github= 'github'¶
-
json= 'json'¶
-
latex= 'latex'¶
-
mediawiki= 'mediawiki'¶
-
tsv= 'tsv'¶
-
xml= 'xml'¶
-
-
class
lodstorage.query.Query(name: str, query: str, lang='sparql', endpoint: str = None, database: str = 'blazegraph', title: str = None, description: str = None, limit: int = None, prefixes=None, tryItUrl: str = None, formats: list = None, debug=False)[source]¶ Bases:
object
a Query e.g. for SPARQL
Constructor
Parameters: - name (string) – the name/label of the query
- query (string) – the native Query text e.g. in SPARQL
- lang (string) – the language of the query e.g. SPARQL
- endpoint (string) – the endpoint url to use
- database (string) – the type of database e.g. “blazegraph”
- title (string) – the header/title of the query
- description (string) – the description of the query
- limit (int) – the limit of the query, default: None
- prefixes (list) – list of prefixes to be resolved
- tryItUrl (str) – the url of a “tryit” webpage
- formats (list) – key,value pairs of ValueFormatters to be applied
- debug (boolean) – true if debug mode should be switched on
-
asWikiMarkup(listOfDicts)[source]¶ convert the given listOfDicts result to MediaWiki markup
Parameters: listOfDicts (list) – the list of Dicts to convert to MediaWiki markup
Returns: the markup
Return type: string
-
asWikiSourceMarkup()[source]¶ convert me to Mediawiki markup for syntax highlighting using the “source” tag
Returns: the Markup
Return type: string
-
documentQueryResult(qlod: list, limit=None, tablefmt: str = 'mediawiki', tryItUrl: str = None, withSourceCode=True, **kwArgs)[source]¶ document the given query results - note that a copy of the whole list is going to be created for being able to format
Parameters: - qlod – the list of dicts result
- limit (int) – the maximum number of records to display in result tabulate
- tablefmt (str) – the table format to use
- tryItUrl – the “try it!” url to show
- withSourceCode (bool) – if True document the source code
Returns: the documentation tabular text for the given parameters
Return type: str
-
formatWithValueFormatters(lod, tablefmt: str)[source]¶ format the given list of Dicts with the ValueFormatters
-
getLink(url, title, tablefmt)[source]¶ convert the given url and title to a link for the given tablefmt
Parameters: - url (str) – the url to convert
- title (str) – the title to show
- tablefmt (str) – the table format to use
-
getTryItUrl(baseurl: str, database: str = 'blazegraph')[source]¶ return the “try it!” url for the given baseurl
Parameters: baseurl (str) – the baseurl to use
Returns: the “try it!” url for the given query
Return type: str
-
preFormatWithCallBacks(lod, tablefmt: str)[source]¶ run the configured call backs to pre-format the given list of dicts for the given tableformat
Parameters: - lod (list) – the list of dicts to handle
- tablefmt (str) – the table format (according to tabulate) to apply
-
prefixToLink(lod: list, prefix: str, tablefmt: str)[source]¶ convert url prefixes to link according to the given table format TODO - refactor as preFormat callback
Parameters: - lod (list) – the list of dicts to convert
- prefix (str) – the prefix to strip
- tablefmt (str) – the tabulate tableformat to use
-
-
class
lodstorage.query.QueryManager(lang: str = None, debug=False, queriesPath=None)[source]¶ Bases:
object
manages pre packaged Queries
Constructor
Parameters: - lang (string) – the language to use for the queries: sql or sparql
- debug (boolean) – True if debug information should be shown
-
class
lodstorage.query.QueryResultDocumentation(query, title: str, tablefmt: str, tryItMarkup: str, sourceCodeHeader: str, sourceCode: str, resultHeader: str, result: str)[source]¶ Bases:
object
documentation of a query result
constructor
Parameters: - query (Query) – the query to be documented
- title (str) – the title markup
- tablefmt (str) – the tableformat that has been used
- tryItMarkup – the “try it!” markup to show
- sourceCodeHeader (str) – the header title to use for the sourceCode
- sourceCode (str) – the sourceCode
- resultHeader (str) – the header title to use for the result
- result (str) – the result header
-
asText()[source]¶ return my text representation
Returns: description, sourceCodeHeader, sourceCode, tryIt link and result table
Return type: str
-
static
uniCode2Latex(text: str, withConvert: bool = False) → str[source]¶ converts unicode text to latex and fixes UTF-8 chars for latex in a certain range:
₀:$_0$ … ₉:$_9$
see https://github.com/phfaist/pylatexenc/issues/72
Parameters: - text (str) – the string to fix
- withConvert (bool) – if unicode to latex library conversion should be used
Returns: latex presentation of UTF-8 char
Return type: str
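The mapping ₀:$_0$ … ₉:$_9$ mentioned above replaces each Unicode subscript digit with a LaTeX math subscript. A minimal sketch of just that mapping, with the hypothetical name subscripts_to_latex; it does not cover the optional library conversion controlled by withConvert.

```python
def subscripts_to_latex(text):
    """Replace the Unicode subscript digits U+2080..U+2089 with $_0$ .. $_9$."""
    for digit in range(10):
        subscript_char = chr(0x2080 + digit)  # U+2080 is SUBSCRIPT ZERO
        text = text.replace(subscript_char, f"$_{digit}$")
    return text


print(subscripts_to_latex("H\u2082O"))
```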
-
class
lodstorage.query.QuerySyntaxHighlight(query, highlightFormat: str = 'html')[source]¶ Bases:
object
Syntax highlighting for queries with pygments
construct me for the given query and highlightFormat
Parameters: - query (Query) – the query to do the syntax highlighting for
- highlightFormat (str) – the highlight format to be used
-
class
lodstorage.query.ValueFormatter(name: str, formatString: str, regexps: list = None)[source]¶ Bases:
object
a value Formatter
constructor
Parameters: - formatString (str) – the format String to use
- regexps (list) – the regular expressions to apply
-
applyFormat(record, key, resultFormat: lodstorage.query.Format)[source]¶ apply the given format to the given record
Parameters: - record (dict) – the record to handle
- key (str) – the property key
- resultFormat (str) – the resultFormat Style to apply
-
formatsPath= '/home/docs/checkouts/readthedocs.org/user_builds/pylodstorage/checkouts/stable/lodstorage/../sampledata/formats.yaml'¶
-
classmethod
getFormats(formatsPath: str = None) → dict[source]¶ get the available ValueFormatters
Parameters: formatsPath (str) – the path to the yaml file to read the format specs from
Returns: a map for ValueFormatters by formatter Name
Return type: dict
-
home= '/home/docs'¶
-
valueFormats= None¶
lodstorage.sample module¶
Created on 2020-08-24
@author: wf
-
class
lodstorage.sample.Royal[source]¶ Bases:
lodstorage.jsonable.JSONAble
I am a single Royal
Constructor
-
class
lodstorage.sample.Royals(load=False)[source]¶ Bases:
lodstorage.jsonable.JSONAbleList
a non ORM Royals list
lodstorage.schema module¶
Created on 2021-01-26
@author: wf
-
class
lodstorage.schema.Schema(name: str, title: str)[source]¶ Bases:
object
a relational Schema
Constructor
Parameters: - name (str) – the name of the schema
- title (str) – the title of the schema
-
static
generalizeColumn(tableList, colName: str)[source]¶ remove the column with the given name from all tables in the tablelist and return it
Parameters: - tableList (list) – a list of Tables
- colName (string) – the name of the column to generalize
Returns: the column having been generalized and removed
Return type: string
-
static
getGeneral(tableList, name: str, debug: bool = False)[source]¶ derive a general table from the given table list
Parameters: - tableList (list) – a list of tables
- name (str) – name of the general table
- debug (bool) – True if column names should be shown
Returns: a table dict for the generalized table
lodstorage.sparql module¶
Created on 2020-08-14
@author: wf
-
class
lodstorage.sparql.SPARQL(url, mode='query', debug=False, isFuseki=False, typedLiterals=False, profile=False, agent='PyLodStorage', method='POST')[source]¶ Bases:
object
wrapper for SPARQL e.g. Apache Jena, Virtuoso, Blazegraph
Variables: - url – full endpoint url (including mode)
- mode – ‘query’ or ‘update’
- debug – True if debugging is active
- typedLiterals – True if INSERT should be done with typedLiterals
- profile(boolean) – True if profiling / timing information should be displayed
- sparql – the SPARQLWrapper2 instance to be used
- method(str) – the HTTP method to be used ‘POST’ or ‘GET’
Constructor a SPARQL wrapper
Parameters: - url (string) – the base URL of the endpoint - the mode query/update is going to be appended
- mode (string) – ‘query’ or ‘update’
- debug (bool) – True if debugging is to be activated
- typedLiterals (bool) – True if INSERT should be done with typedLiterals
- profile (boolean) – True if profiling / timing information should be displayed
- agent (string) – the User agent to use
- method (string) – the HTTP method to be used ‘POST’ or ‘GET’
-
addAuthentication(username: str, password: str, method: Union[BASIC, DIGEST] = 'BASIC')[source]¶ Add HTTP Authentication credentials to the sparql wrapper
Parameters: - username – name of the user
- password – password of the user
- method – HTTP Authentication method
-
asListOfDicts(records, fixNone: bool = False, sampleCount: int = None)[source]¶ convert SPARQL result back to python native
Parameters: - records (list) – the list of bindings
- fixNone (bool) – if True add None values for empty columns in Dict
- sampleCount (int) – the number of samples to check
Returns: a list of Dicts
Return type: list
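Converting SPARQL results back to native Python values can be illustrated on the plain SPARQL 1.1 JSON results format, where each binding maps a variable name to a dict with type, value and an optional datatype. The sketch below works on such dicts and is an assumption about the general approach; the wrapper class itself is documented as using a SPARQLWrapper2 instance, whose binding objects differ. The names bindings_to_lod and CONVERTERS and the sample bindings are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# a small, deliberately incomplete mapping of XSD datatypes to Python converters
CONVERTERS = {
    XSD + "integer": int,
    XSD + "decimal": float,
    XSD + "double": float,
    XSD + "boolean": lambda value: value == "true",
}


def bindings_to_lod(bindings, fix_none=False):
    """Convert SPARQL JSON result bindings to a list of dicts (hypothetical helper)."""
    lod = []
    for binding in bindings:
        record = {}
        for var, cell in binding.items():
            convert = CONVERTERS.get(cell.get("datatype"), str)
            record[var] = convert(cell["value"])
        lod.append(record)
    if fix_none:
        # make sure every record has every column, filled with None if unbound
        all_keys = []
        for record in lod:
            for key in record:
                if key not in all_keys:
                    all_keys.append(key)
        for record in lod:
            for key in all_keys:
                record.setdefault(key, None)
    return lod


# made-up sample bindings
bindings = [
    {
        "name": {"type": "literal", "value": "Aachen"},
        "population": {"type": "literal", "datatype": XSD + "integer", "value": "250000"},
    },
    {"name": {"type": "literal", "value": "Bonn"}},
]
lod = bindings_to_lod(bindings, fix_none=True)
print(lod)
```

Unbound variables are simply absent from a binding, which is why an option like fixNone is useful when the result is later treated as a table with fixed columns.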
-
controlChars= ['\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07', '\x08', '\t', '\n', '\x0b', '\x0c', '\r', '\x0e', '\x0f', '\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17', '\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f']¶
-
classmethod
fromEndpointConf(endpointConf) → lodstorage.sparql.SPARQL[source]¶ create a SPARQL endpoint from the given EndpointConfiguration
Parameters: endpointConf (Endpoint) – the endpoint configuration to be used
-
getFirst(qLod: list, attr: str)[source]¶ get the column attr of the first row of the given qLod list
Parameters: - qLod (list) – the list of dicts (returned by a query)
- attr (str) – the attribute to retrieve
Returns: the value
Return type: object
-
getLocalName(name)[source]¶ retrieve valid localname from a string based primary key https://www.w3.org/TR/sparql11-query/#prefNames
Parameters: name (string) – the name to convert
Returns: a valid local name
Return type: string
-
getResults(jsonResult)[source]¶ get the result from the given jsonResult
Parameters: jsonResult – the JSON encoded result
Returns: the list of bindings
Return type: list
-
getValue(sparqlQuery: str, attr: str)[source]¶ get the value for the given SPARQL query using the given attr
Parameters: - sparql (SPARQL) – the SPARQL endpoint to get the value for
- sparqlQuery (str) – the SPARQL query to run
- attr (str) – the attribute to get
-
getValues(sparqlQuery: str, attrList: list)[source]¶ get Values for the given sparqlQuery and attribute list
Parameters: - sparqlQuery (str) – the query which did not return any values
- attrList (list) – the list of attributes
-
insert(insertCommand)[source]¶ run an insert
Parameters: insertCommand (string) – the SPARQL INSERT command
Returns: a response
-
insertListOfDicts(listOfDicts, entityType, primaryKey, prefixes, limit=None, batchSize=None, profile=False)[source]¶ insert the given list of dicts mapping datatypes
Parameters: - entityType (string) – the entityType to use as a
- primaryKey (string) – the name of the primary key attribute to use
- prefixes (string) – any PREFIX statements to be used
- limit (int) – maximum number of records to insert
- batchSize (int) – number of records to send per request
Returns: a list of errors which should be empty on full success
datatype mapping according to https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
mapped from https://docs.python.org/3/library/stdtypes.html
compare to https://www.w3.org/2001/sw/rdb2rdf/directGraph/ http://www.bobdc.com/blog/json2rdf/ https://www.w3.org/TR/json-ld11-api/#data-round-tripping https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
-
insertListOfDictsBatch(listOfDicts, entityType, primaryKey, prefixes, title='batch', batchIndex=None, total=None, startTime=None)[source]¶ insert a Batch part of listOfDicts
Parameters: - entityType (string) – the entityType to use as a
- primaryKey (string) – the name of the primary key attribute to use
- prefixes (string) – any PREFIX statements to be used
- title (string) – the title to display for the profiling (if any)
- batchIndex (int) – the start index of the current batch
- total (int) – the total number of records for all batches
- startTime (datetime) – the start of the batch processing
Returns: a list of errors which should be empty on full success
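How a list of dicts might be cut into batches, with each batch identified by its start index, can be sketched as follows. This is a generic slicing sketch under the assumption that batches are contiguous slices; the generator name `batches` is hypothetical and the library's internal loop may differ.

```python
def batches(records, batch_size):
    """Yield (start_index, batch) pairs covering records in order (hypothetical helper)."""
    for start in range(0, len(records), batch_size):
        yield start, records[start:start + batch_size]


# Made-up sample data: seven records, sent three per request
records = [{"id": i} for i in range(7)]
total = len(records)
for batch_index, batch in batches(records, 3):
    print(f"batch starting at {batch_index}: {len(batch)} of {total} records")
```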
-
printErrors(errors)[source]¶ print the given list of errors
Parameters: errors (list) – a list of error strings Returns: True if the list is empty else false Return type: boolean
-
query(queryString, method='POST')[source]¶ get a list of results for the given query
Parameters: - queryString (string) – the SPARQL query to execute
- method (string) – the method to use, e.g. POST
Returns: list of bindings
Return type: list
-
queryAsListOfDicts(queryString, fixNone: bool = False, sampleCount: int = None)[source]¶ get a list of dicts for the given query (to allow round-trip results for insertListOfDicts)
Parameters: - queryString (string) – the SPARQL query to execute
- fixNone (bool) – if True add None values for empty columns in Dict
- sampleCount (int) – the number of samples to check
Returns: a list of Dicts
Return type: list
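The effect described for fixNone, adding None values for columns that are empty in some records, can be approximated in plain Python. This is a sketch of the idea only; `fill_missing_with_none` is a hypothetical helper and not the library's implementation.

```python
def fill_missing_with_none(list_of_dicts):
    """Ensure every dict has every key seen in any dict, using None as filler."""
    # collect all keys in first-seen order
    fields = []
    for record in list_of_dicts:
        for key in record:
            if key not in fields:
                fields.append(key)
    # add the missing keys in place
    for record in list_of_dicts:
        for key in fields:
            record.setdefault(key, None)
    return list_of_dicts


# Made-up sample rows: the second one lacks the "age" column
rows = [{"name": "A", "age": 1}, {"name": "B"}]
fill_missing_with_none(rows)
print(rows)
```

Uniform keys matter for round-tripping because a later bulk insert typically expects every record to supply the same columns.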
lodstorage.sql module¶
Created on 2020-08-24
@author: wf
-
class
lodstorage.sql.EntityInfo(sampleRecords, name, primaryKey=None, debug=False)[source]¶ Bases:
object
holds entity meta info
Variables: - name(string) – entity name = table name
- primaryKey(string) – the name of the primary key column
- typeMap(dict) – maps column names to python types
- debug(boolean) – True if debug information should be shown
construct me from the given name and primary key
Parameters: - name (string) – the name of the entity
- primaryKey (string) – the name of the primary key column
- debug (boolean) – True if debug information should be shown
-
addType(column, valueType, sqlType)[source]¶ add the python type for the given column to the typeMap
Parameters: - column (string) – the name of the column
- valueType (type) – the python type of the column
-
fixDates(resultList)[source]¶ fix date entries in the given resultList by parsing the date content e.g. converting ‘1926-04-21’ back to datetime.date(1926, 4, 21)
Parameters: resultList (list) – the list of records to be fixed
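The date conversion described above can be sketched with `datetime.date.fromisoformat`. The sketch assumes the caller names the date columns explicitly, whereas the library presumably derives them from its typeMap; `fix_dates` and `date_columns` are hypothetical names.

```python
import datetime


def fix_dates(result_list, date_columns):
    """Parse ISO date strings in the given columns back into datetime.date values."""
    for record in result_list:
        for column in date_columns:
            value = record.get(column)
            # only strings are converted; None and real dates are left alone
            if isinstance(value, str):
                record[column] = datetime.date.fromisoformat(value)
    return result_list


# Made-up sample row as it might come back from a database
rows = [{"name": "Example Person", "born": "1926-04-21"}]
fix_dates(rows, ["born"])
print(rows[0]["born"])
```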
-
getCreateTableCmd(sampleRecords)[source]¶ get the CREATE TABLE DDL command for the given sample records
Parameters: sampleRecords (list) – a list of Dicts of sample Records Returns: CREATE TABLE DDL command for this entity info Return type: string Example:
CREATE TABLE Person(name TEXT PRIMARY KEY,born DATE,numberInLine INTEGER,wikidataurl TEXT,age FLOAT,ofAge BOOLEAN)
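How a DDL command like the one above could be derived from a sample record is sketched below. The type table and the function `create_table_cmd` are assumptions for illustration, chosen so that the output matches the documented example; the library's actual type detection may cover more cases.

```python
import datetime

# Hypothetical Python type -> SQL column type table (illustrative only)
SQL_TYPES = {
    str: "TEXT",
    int: "INTEGER",
    float: "FLOAT",
    bool: "BOOLEAN",
    datetime.date: "DATE",
    datetime.datetime: "TIMESTAMP",
}


def create_table_cmd(entity_name, sample_record, primary_key=None):
    """Build a CREATE TABLE command from the value types of one sample record."""
    columns = []
    for column, value in sample_record.items():
        # exact type lookup so that bool is not mistaken for int
        sql_type = SQL_TYPES.get(type(value), "TEXT")
        suffix = " PRIMARY KEY" if column == primary_key else ""
        columns.append(f"{column} {sql_type}{suffix}")
    return f"CREATE TABLE {entity_name}({','.join(columns)})"


# Made-up sample record using the column names of the documented example
sample = {
    "name": "Example Person",
    "born": datetime.date(1926, 4, 21),
    "numberInLine": 0,
    "wikidataurl": "https://example.org/person",
    "age": 94.0,
    "ofAge": True,
}
ddl = create_table_cmd("Person", sample, primary_key="name")
print(ddl)
```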
-
getInsertCmd(replace: bool = False) → str[source]¶ get the INSERT command for this entityInfo
Parameters: replace (bool) – if True allow replace for insert Returns: the INSERT INTO SQL command for this entityInfo Return type: str Example:
INSERT INTO Person (name,born,numberInLine,wikidataurl,age,ofAge) values (?,?,?,?,?,?).
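A command of this shape can be assembled from the column names alone. The sketch below assumes positional `?` placeholders as shown in the example and assumes that replace corresponds to SQLite's `INSERT OR REPLACE`; both are assumptions about the library, and `insert_cmd` is a hypothetical name.

```python
def insert_cmd(entity_name, columns, replace=False):
    """Build an INSERT command with one positional placeholder per column."""
    # assumption: "replace" maps to SQLite's INSERT OR REPLACE
    verb = "INSERT OR REPLACE INTO" if replace else "INSERT INTO"
    placeholders = ",".join("?" for _ in columns)
    return f"{verb} {entity_name} ({','.join(columns)}) values ({placeholders})"


columns = ["name", "born", "numberInLine", "wikidataurl", "age", "ofAge"]
cmd = insert_cmd("Person", columns)
print(cmd)
```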
-
class
lodstorage.sql.SQLDB(dbname: str = ':memory:', connection=None, check_same_thread=True, timeout=5, debug=False, errorDebug=False)[source]¶ Bases:
object
Structured Query Language Database wrapper
Variables: - dbname(string) – name of the database
- debug(boolean) – True if debug info should be provided
- errorDebug(boolean) – True if debug info should be provided on errors (should not be used for production since it might reveal data)
Construct me for the given dbname and debug
Parameters: - dbname (string) – name of the database - default is a RAM based database
- connection (Connection) – an optional connection to be reused
- check_same_thread (boolean) – True if object handling needs to be on the same thread see https://stackoverflow.com/a/48234567/1497139
- timeout (float) – number of seconds for connection timeout
- debug (boolean) – if True switch on debug
- errorDebug (boolean) – True if debug info should be provided on errors (should not be used for production since it might reveal data)
-
RAM= ':memory:'¶
-
backup(backupDB, action='Backup', profile=False, showProgress: int = 200, doClose=True)[source]¶ create backup of this SQLDB to the given backup db
see https://stackoverflow.com/a/59042442/1497139
Parameters: - backupDB (string) – the path to the backupdb or SQLDB.RAM for in memory
- action (string) – the action to display
- profile (boolean) – True if timing information shall be shown
- showProgress (int) – show progress at each showProgress page (0=show no progress)
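The page-wise backup with progress reporting can be illustrated with the standard library's `sqlite3.Connection.backup`, which the linked Stack Overflow answer also builds on. This is a sketch against plain `sqlite3` connections rather than the SQLDB wrapper; the function name `backup_with_progress` and the sample table are made up.

```python
import sqlite3


def backup_with_progress(source, target, pages_per_step=200):
    """Copy source into target, recording progress after each step of pages."""
    progress_log = []

    def progress(status, remaining, total):
        # called by sqlite3 after every step of pages_per_step pages
        progress_log.append((total - remaining, total))

    source.backup(target, pages=pages_per_step, progress=progress)
    return progress_log


# Made-up in-memory source database
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Person(name TEXT PRIMARY KEY)")
source.execute("INSERT INTO Person (name) values (?)", ("Example Person",))
source.commit()

target = sqlite3.connect(":memory:")
log = backup_with_progress(source, target)
print(target.execute("SELECT name FROM Person").fetchall())
```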
-
copyTo(copyDB, profile=True)[source]¶ copy my content to another database
Parameters: - copyDB (Connection) – the target database
- profile (boolean) – if True show profile information
-
createTable(listOfRecords, entityName: str, primaryKey: str = None, withCreate: bool = True, withDrop: bool = False, sampleRecordCount=1, failIfTooFew=True)[source]¶ derive the Data Definition Language CREATE TABLE command from a list of records by examining the first records as defining sample records, and execute the DDL command
auto detect column types see e.g. https://stackoverflow.com/a/57072280/1497139
Parameters: - listOfRecords (list) – a list of Dicts
- entityName (string) – the entity / table name to use
- primaryKey (string) – the key/column to use as a primary key
- withDrop (boolean) – true if the existing Table should be dropped
- withCreate (boolean) – true if the create Table command should be executed - false if only the entityInfo should be returned
- sampleRecordCount (int) – number of sample records expected and to be inspected
- failIfTooFew (boolean) – raise an Exception if there are too few sample records, else warn only
Returns: meta data information for the created table
Return type:
-
execute(ddlCmd)[source]¶ execute the given Data Definition Command
Parameters: ddlCmd (string) – e.g. a CREATE TABLE or CREATE View command
-
executeDump(connection, dump, title, maxErrors=100, errorDisplayLimit=12, profile=True)[source]¶ execute the given dump for the given connection
Parameters: - connection (Connection) – the sqlite3 connection to use
- dump (string) – the SQL commands for the dump
- title (string) – the title of the dump
- maxErrors (int) – maximum number of errors to be tolerated before stopping and doing a rollback
- profile (boolean) – True if profiling information should be shown
Returns: a list of errors
-
getDebugInfo(record, index, executeMany)[source]¶ get the debug info for the given record at the given index depending on the state of executeMany
Parameters: - record (dict) – the record to show
- index (int) – the index of the record
- executeMany (boolean) – if True the record may be valid else not
-
getTableDict(tableType='table')[source]¶ get the schema information from this database as a dict
Parameters: tableType (str) – table or view Returns: Lookup map of tables with columns also being converted to dict Return type: dict
-
getTableList(tableType='table')[source]¶ get the schema information from this database
Parameters: tableType (str) – table or view Returns: a list as derived from PRAGMA table_info Return type: list
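Schema information of this kind can be gathered in SQLite by listing entries of `sqlite_master` and running `PRAGMA table_info` for each one. The sketch below uses plain `sqlite3`; the shape of the returned dicts (`name`, `columns`, `type`) is an assumption and may not match what the library returns.

```python
import sqlite3


def get_table_list(connection, table_type="table"):
    """Return one dict per table (or view) with its columns from PRAGMA table_info."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type=?", (table_type,)
    )
    table_list = []
    for (name,) in cursor.fetchall():
        # PRAGMA statements do not accept bound parameters, so the name is
        # interpolated; it comes from sqlite_master, not from user input
        rows = connection.execute(f'PRAGMA table_info("{name}")').fetchall()
        # each row is (cid, name, type, notnull, dflt_value, pk)
        columns = [{"name": row[1], "type": row[2]} for row in rows]
        table_list.append({"name": name, "columns": columns})
    return table_list


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Person(name TEXT PRIMARY KEY,born DATE)")
table_list = get_table_list(connection)
print(table_list)
```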
-
logError(msg)[source]¶ log the given error message to stderr
Parameters: msg (str) – the error message to display
-
query(sqlQuery, params=None)[source]¶ run the given sqlQuery and return a list of Dicts
Parameters: - sqlQuery (string) – the SQL query to be executed
- params (tuple) – the query params, if any
Returns: a list of Dicts
Return type: list
-
queryAll(entityInfo, fixDates=True)[source]¶ query all records for the given entityName/tableName
Parameters: - entityName (string) – name of the entity/table to query
- fixDates (boolean) – True if date entries should be returned as such and not as strings
-
queryGen(sqlQuery, params=None)[source]¶ run the given sqlQuery as a generator for dicts
Parameters: - sqlQuery (string) – the SQL query to be executed
- params (tuple) – the query params, if any
Returns: a generator of dicts
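A generator of dicts over a query result can be sketched with a plain `sqlite3` cursor by pairing each row with the column names from `cursor.description`. This illustrates the pattern only and is not the SQLDB code; `query_gen` is a hypothetical name.

```python
import sqlite3


def query_gen(connection, sql_query, params=None):
    """Yield one dict per result row without materializing the whole result."""
    cursor = connection.cursor()
    cursor.execute(sql_query, params or ())
    column_names = [description[0] for description in cursor.description]
    for row in cursor:
        yield dict(zip(column_names, row))


# Made-up sample table
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Person(name TEXT PRIMARY KEY,age FLOAT)")
connection.execute("INSERT INTO Person (name,age) values (?,?)", ("A", 1.0))
connection.execute("INSERT INTO Person (name,age) values (?,?)", ("B", 2.0))

for record in query_gen(connection, "SELECT name,age FROM Person WHERE age>?", (1.5,)):
    print(record)
```

Compared with returning a list, a generator keeps memory use flat for large result sets, at the cost of holding the cursor open while the caller iterates.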
-
static
restore(backupDB, restoreDB, profile=False, showProgress=200, debug=False)[source]¶ restore the restoreDB from the given backup DB
Parameters: - backupDB (string) – path to the backupDB e.g. backup.db
- restoreDB (string) – path to the restoreDB or in Memory SQLDB.RAM
- profile (boolean) – True if timing information should be shown
- showProgress (int) – show progress at each showProgress page (0=show no progress)
-
showDump(dump, limit=10)[source]¶ show the given dump up to the given limit
Parameters: - dump (string) – the SQL dump to show
- limit (int) – the maximum number of lines to display
-
store(listOfRecords, entityInfo, executeMany=False, fixNone=False, replace=False)[source]¶ store the given list of records based on the given entityInfo
Parameters: - listOfRecords (list) – the list of Dicts to be stored
- entityInfo (EntityInfo) – the meta data to be used for storing
- executeMany (bool) – if True the insert command is done with many/all records at once
- fixNone (bool) – if True make sure empty columns in the listOfDict are filled with “None” values
- replace (bool) – if True allow replace for insert
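The difference between inserting all records at once and inserting them one by one can be sketched with `sqlite3`. With `executemany` a failure cannot easily be traced to a single record, whereas a per-record loop can report the offending index. The sketch assumes named placeholders and dict records; `store_records` is a hypothetical helper, not the library's method.

```python
import sqlite3


def store_records(connection, insert_cmd, records, execute_many=False):
    """Insert dict records either in one executemany call or one at a time."""
    cursor = connection.cursor()
    if execute_many:
        # fast, but an error does not say which record caused it
        cursor.executemany(insert_cmd, records)
    else:
        for index, record in enumerate(records):
            try:
                cursor.execute(insert_cmd, record)
            except sqlite3.Error as error:
                # slower, but the failing record can be identified
                raise ValueError(f"record #{index} failed: {record}") from error
    connection.commit()


# Made-up sample table and records
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Person(name TEXT PRIMARY KEY,age FLOAT)")
people = [{"name": "A", "age": 1.0}, {"name": "B", "age": 2.0}]
store_records(
    connection,
    "INSERT INTO Person (name,age) values (:name,:age)",
    people,
    execute_many=True,
)
print(connection.execute("SELECT count(*) FROM Person").fetchone()[0])
```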
lodstorage.storageconfig module¶
Created on 2020-08-29
@author: wf
-
class
lodstorage.storageconfig.StorageConfig(mode=<StoreMode.SQL: 3>, cacheRootDir: str = None, cacheDirName: str = 'lodstorage', cacheFile=None, withShowProgress=True, profile=True, debug=False, errorDebug=True)[source]¶ Bases:
object
a storage configuration
Constructor
Parameters: - mode (StoreMode) – the storage mode e.g. sql
- cacheRootDir (str) – the cache root directory to use - if None the home directory will be used
- cacheFile (string) – the common cacheFile to use (if any)
- withShowProgress (boolean) – True if progress should be shown
- profile (boolean) – True if timing / profiling information should be shown
- debug (boolean) – True if debugging information should be shown
- errorDebug (boolean) – True if debug info should be provided on errors (should not be used for production since it might reveal data)
lodstorage.uml module¶
Created on 2020-09-04
@author: wf
-
class
lodstorage.uml.UML(debug=False)[source]¶ Bases:
object
UML diagrams via plantuml
Constructor
Parameters: debug (boolean) – True if debug information should be shown
-
mergeSchema(schemaManager, tableList, title=None, packageName=None, generalizeTo=None, withSkin=True)[source]¶ merge Schema and tableList to PlantUml notation
Parameters: - schemaManager (SchemaManager) – a schema manager to be used
- tableList (list) – the tableList list of Dicts from getTableList() to convert
- title (string) – optional title to be added
- packageName (string) – optional packageName to be added
- generalizeTo (string) – optional name of a general table to be derived
- withSkin (boolean) – if True add default BITPlan skin parameters
Returns: the Plantuml notation for the entities in columns of the given tablelist
Return type: string
-
skinparams= "\n' BITPlan Corporate identity skin params\n' Copyright (c) 2015-2020 BITPlan GmbH\n' see http://wiki.bitplan.com/PlantUmlSkinParams#BITPlanCI\n' skinparams generated by com.bitplan.restmodelmanager\nskinparam note {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam component {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam package {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam usecase {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam activity {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam classAttribute {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam interface {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam class {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nskinparam object {\n BackGroundColor #FFFFFF\n FontSize 12\n ArrowColor #FF8000\n BorderColor #FF8000\n FontColor black\n FontName Technical\n}\nhide Circle\n' end of skinparams '\n"¶
-
tableListToPlantUml(tableList, title=None, packageName=None, generalizeTo=None, withSkin=True)[source]¶ convert tableList to PlantUml notation
Parameters: - tableList (list) – the tableList list of Dicts from getTableList() to convert
- title (string) – optional title to be added
- packageName (string) – optional packageName to be added
- generalizeTo (string) – optional name of a general table to be derived
- withSkin (boolean) – if True add default BITPlan skin parameters
Returns: the Plantuml notation for the entities in columns of the given tablelist
Return type: string
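Converting a table list into PlantUML class notation can be sketched as simple string assembly. The input structure (dicts with `name` and `columns`, each column having `name` and `type`) is an assumption about what getTableList() yields, and the sketch leaves out generalization and skin parameters; `table_list_to_plantuml` is a hypothetical stand-in, not the library method.

```python
def table_list_to_plantuml(table_list, title=None, package_name=None):
    """Render each table as a PlantUML class with one attribute per column."""
    lines = ["@startuml"]
    if title is not None:
        lines.append(f"title {title}")
    indent = ""
    if package_name is not None:
        lines.append(f"package {package_name} {{")
        indent = "  "
    for table in table_list:
        lines.append(f"{indent}class {table['name']} {{")
        for column in table["columns"]:
            lines.append(f"{indent}  {column['name']} : {column['type']}")
        lines.append(f"{indent}}}")
    if package_name is not None:
        lines.append("}")
    lines.append("@enduml")
    return "\n".join(lines)


# Made-up table list in the assumed structure
table_list = [
    {
        "name": "Person",
        "columns": [
            {"name": "name", "type": "TEXT"},
            {"name": "born", "type": "DATE"},
        ],
    }
]
uml = table_list_to_plantuml(table_list, title="Persons", package_name="demo")
print(uml)
```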
-