Documentation - PBMS v0.5.08
PBMS stands for PrimeBase Media Streaming. The PBMS engine is a MySQL storage engine which enables the storage and streaming of BLOB data in and out of a MySQL database.
MySQL 5.1 provides a pluggable storage engine API which allows 3rd parties to extend the storage capabilities of the server. Storage engines are built as shared libraries and can be loaded at runtime by the server. Most storage engines provide an alternative way of storing the table data. By doing this they add functionality to the server in terms of transactional support, special indexing methods, data warehousing and archiving facilities or even clustering and high-availability capabilities.
To use a particular engine, you specify the name of the engine when creating a table in MySQL. For example:
CREATE TABLE my_test_tab (
n_id INTEGER PRIMARY KEY AUTO_INCREMENT,
n_text LONGBLOB,
n_string VARCHAR(255)
) ENGINE=PBXT;
This example creates a table using the PBXT storage engine.
The PBMS engine specializes in the storage and retrieval of BLOB data. Unlike other storage engines, it does not actually provide table storage. So you would never actually create tables of type PBMS.
Instead, PBMS works together with a standard storage engine, in order to add functionality to MySQL tables. The most basic additional functionality provided by PBMS is:
- Direct HTTP stream-based access to database information. For this purpose, the PBMS engine includes a lightweight HTTP server. The port of the server can by setting the pbms_port system variable on the mysqld command line. BLOB data can be uploaded using the HTTP PUT or POST method, and downloaded using the HTTP GET method.
- A repository for BLOB data. If an engine is not suitable for BLOB storage it can use the BLOB repository provided by PBMS. In this case, the engine simply stores a BLOB reference in the table column, instead of the actual BLOB data. PBMS tracks references to the repository in order to know when to free up resources.
2. "Streaming-enabled" Engines
A streaming-enabled storage engine is a MySQL storage engine that supports the PBMS server-side streaming API. PBMS blob references stored in the tables of a streaming-enabled engine will automatically increment and decrement BLOB reference counters when BLOB references are inserted, updated, and deleted. Streaming-enabled engines will also automatically notify the PBMS engine when tables containing blob references are altered or dropped.
The PBMS engine can be used with non streaming-enabled engines through the use of a set of User Defined Functions (UDFs) provided by the PBMS engine. These UDFs can be used in triggers to notify the PBMS engine when BLOB references are inserted, updated, or deleted in tables. UDFs are also provided that can be called to notify the PBMS engine when tables containing blob references are altered or dropped.
A streaming-enabled table is a MySQL table that either belongs to a "Streaming-enabled" engine or has triggers defined on blob referencing columns that increment and decrement BLOB reference counters when BLOB references are inserted, updated, and deleted.
The following is an example of how a streaming-enabled table can be defined for a non streaming-enabled engine:
Create table x.foo(c1 integer, c2 longblob) ENGINE = MYISAM;
create trigger x.foo_insert_trig BEFORE INSERT on x.foo for each row BEGIN set NEW.c2 = pbms_insert_blob_trig("x", "foo", 2, NEW.c2); END
create trigger x.foo_update_trig BEFORE UPDATE on x.foo for each row BEGIN set NEW.c2 = pbms_update_blob_trig("x", "foo", 2, OLD.c2, NEW.c2); END
create trigger x.foo_delete_trig BEFORE DELETE on x.foo for each row BEGIN declare dummy integer; set dummy = pbms_delete_blob_trig("x", "foo", 2, OLD.c2); END
Streaming-enabled tables can store BLOB data in the PBMS BLOB repository where reference counts to the data are maintained so that the BLOB will automatically be deleted ounce the reference count drops to zero.
Currently, the only engine that is streaming-enabled is PBXT, version 0.9.88 later.
The most recent version of both engines are provided on this web-site. You can download the source code and compile the engines yourself or you can use one of the binaries that are provided. However, binaries are only available for a very limited number of platforms.
A BLOB reference column is a column in a streaming-enabled table which contains PBMS BLOB references and notifies PBMS when a reference in the column is inserted, updated or deleted. For tables created by stream-enabled engines, which columns in a table are treated as BLOB reference columns is engine dependent, currently in PBXT longblob columns are assumed to be BLOB reference columns. For tables created by non stream-enabled engines BLOB reference columns are the columns for which triggers are defined which call the PBMS UDFs to notify PBMS of inserts, updates, and deletes.
The PBMS client API pbms_init_table() and pbms_reset_table_blobs() functions assume that longblob columns are intended to be BLOB reference columns and includes them in the trigger definitions that they create.
4. Building and Installing PBMS
Please refer to the INSTALL document in the source code distribution for details of how to build and install the PBMS plug-in.
5. Accessing the database using BLOB References
PBMS uses the concept of a BLOB reference. A BLOB reference is, in fact, a Uniform Resource Locator (URL), similar to the Web address of an internet page.
There are currently 4 types of BLOB references:
5.1. A BLOB Repository Reference
This type of URL identifies a BLOB in the PBMS BLOB repository, and has two forms:
http://host-name:pbms-port//~*database-name/~repository-id
or
http://host-name:pbms-port//~*database-name/_repository-id
For example: http://localhost:8080/~*test/~1-150-660294d3-0*123 or http://localhost:8080/~*test/_1-246-57d672ac-0*123
The prefix '~*' helps to identify a BLOB repository URL. This is used by BLOB enabled engines to identify BLOB URLs when inserted into the database. The prefix is also used by the JDBC driver and other client software to recognize BLOB URLs.
BLOB repository reference with a '~' character before the repository-id are references that are already associated with a database table. BLOB repository reference with a '_' character before the repository-id are references that are not associated with a particular table, and are the result of uploading a BLOB using a BLOB database reference (see below).
The repository-id is a value generated by the PBMS BLOB repository. It has the following form: table_id-blob_id-access_code-server_id*size for references that are associated with a database table, and repo_id-repo_offset-access_code-server_id*size for references that are not associated with a database table.
The access code (access_code) is a unique 4-byte value generated by PBMS whenever a BLOB is uploaded into the BLOB repository. This code is required to retrieve the BLOB making it very unlikely for anyone to retrieve the BLOB without the correct URL.
The 'size' is the size of the actual BLOB in bytes.
This URL identifies a table in the database, and has the following form:
http://host-name:pbms-port//database-name/table-name
BLOB table references are used to upload data to a particular table in the BLOB repository. For example:
$ curl -d "This is an example text" http://localhost:8080/test/my_test_tab
The result returned by this command will be a partial BLOB repository reference, of the form:
~*database-name/~repository-id
In order to retrieve the value from the repository, add the host name and PBMS port to the URL, for example:
$ curl "http://localhost:8080/~*test/~1-150-660294d3-0*23"
5.3. A BLOB Database Reference
This type of URL identifies a database repository, and has the following form:
http://host-name:pbms-port//database-name
BLOB database references are used to upload data to a BLOB repository without specifying a table. For example:
$ curl -d "This is an example text" http://localhost:8080/test
The result returned by this command will be a partial BLOB repository reference, of the form:
~*database-name/_repository-id
In order to retrieve the value from the repository, add the host name and PBMS port to the URL, for example:
$ curl "http://localhost:8080/~*test/_1-246-57d672ac-0*123"
BLOB table references are more efficient than BLOB database references and should be used when ever the targeted table name is known.
This URL identifies a particular BLOB in the database by it's alias:
http://host-name:pbms-port//database-name/alias
The alias is set by the user when the BLOB is uploaded to the PBMS engine by including the http header 'PBMS_BLOB_ALIAS' and setting it to the BLOB alias you want to use, for example:
$ curl -H "PBMS_BLOB_ALIAS:MyBLOB" -d "This is an example text" http://localhost:8080/test/my_test_tab
The BLOB can then be retreived using a BLOB alias reference:
$ curl "http://localhost:8080/test/MyBLOB"
A BLOB alias is a special type of BLOB metadata and must be unique with in the database.
6. Storing BLOB References in Tables
As explained above, BLOB data can be uploaded to the BLOB repository using a BLOB database or table references. The upload is done using any standard HTTP client that supports the HTTP PUT or POST method or the client API mybs_put_data() function.
For example, using curl I can upload BLOB data as follows:
$ curl -d "This is an example text" http://localhost:8080/test
The result will be a partial BLOB repository reference, for example: ~*test/_1-246-57d672ac-0*23.
Now to store this BLOB in a database table, you just insert the BLOB reference. For example:
mysql> INSERT my_test_tab (n_text) VALUES ("~*test/_1-246-57d672ac-0*23");
When the streaming-enable table receives this insert, it will call the PBMS engine in order to "retain" the BLOB repository reference in it's BLOB reference columns. Retain basically means that PBMS increments the reference count for the BLOB. PBMS keeps a list of references to each BLOB so that it knows when a BLOB can be freed in order to reclaim resources.
Note that BLOB references should only be inserted into BLOB reference columns in streaming-enabled tables. Only in this case will PBMS be called to retain the BLOB
The following diagram illustrates the steps that are taken to insert a BLOB into the database.
In step (1) the client software uploads the BLOB to the repository using an HTTP PUT request method. On reply, the repository returns a BLOB repository reference in step (2). Step (3), the reference is placed in the row to be inserted in the database. Using a standard SQL INSERT, the client inserts the row into the database table in step (4). In step (5), the streaming enabled table recognizes that BLOB URL has been inserted and automatically calls the BLOB Streaming engine to retain the BLOB.
7. Retrieving BLOBs from Tables
Once a BLOB reference has been stored in a table, a standard SELECT will return the BLOB reference, not the actual BLOB data.
For example:
mysql> select n_id, n_text from my_test_tab where n_id=5;
+------+-----------------------------------+
| n_id | n_text |
+------+-----------------------------------+
| 5 | ~*test/~2-128-57d672ac-0*23 |
+------+-----------------------------------+
The BLOB reference can then be used to retrieve the BLOB data using the HTTP GET method, for example:
$ curl "http://localhost:8080/~*test/~2-128-57d672ac-0*23"
This is an example text
The following diagram illustrates the process of downloading a BLOB from the repository:

In step (1) the row containing a BLOB repository reference is retrieved from the database using a standard SQL SELECT statement. In step (2) the client software recognizes that a valid BLOB URL is in the database field. With this information it creates a complete URL and downloads the BLOB data from the BLOB repository in step (3), using a standard HTTP GET request method. In step (4) the BLOB data is streamed back to the client application.
HTTP headers can be used to send and receive aditional information about the BLOB. They can also be used to control the actions taken by the PBMS engine.
Metadata headers are headers that allow additional information to be stored in the repository with the BLOB. A metadata header can be a user defined or a standard HTTP header such as "Content-Type". The system table "pbms_metadata_headers" contains the metadata headers recognized by the PBMS engine, each database has it's own "pbms_metadata_headers" table. The user can insert into this table headers that they wish to be recognized as metadata headers. When the PBMS HTTP server recieves a BLOB any HTTP headers in the 'HTTP Put' request that are also in the "pbms_metadata_headers" table will be stored with the BLOB as metadata.
On retreival all metadata headers along with their values will be added to the HTTP reply.
Example, assume that the string "About_the_blob" exists in the "pbms_metadata_headers" table. The metadata header, along with the BLOB alias could be sent with the BLOB to the PBMS engine as:
$ curl -H "PBMS_BLOB_ALIAS:MyBLOB" -H "About_the_blob:An example" -d "This is an example text" http://localhost:8080/test
and when fetched back by:
$ curl -D - "http://localhost:8080/test/MyBLOB"
The result would be:
HTTP/1.1 200 OK
PBMS_CHECKSUM: C36B9D679BF48385082118FC61C7E709
PBMS_BLOB_SIZE: 23
PBMS_LAST_ACCESS: 1236370182
PBMS_ACCESS_COUNT: 0
PBMS_CREATION_TIME: 1236370182
PBMS_BLOB_ALIAS: MyBLOB
About_the_blob: An example
Content-Length: 23
This is an example text
The "PBMS_BLOB_ALIAS" header is a special metadata header that enables you to associate a string with the BLOB being uploaded which can then be used later to download it again. The alias can be any string upto 1024 bytes long that conforms to the rules for a valid URL and doesn't start with a '/' character.
In addition to the metadata headers the following headers are also returned:
- PBMS_BLOB_ALIAS: The user defined alias associated with the BLOB.
- PBMS_CHECKSUM: The MD5 checksum of the BLOB data.
- PBMS_BLOB_SIZE: The size of the BLOB data in bytes.
- PBMS_LAST_ACCESS: The last access time of the BLOB in seconds since Jan. 1 1970.
- PBMS_ACCESS_COUNT: The number of times the BLOB has been downloaded.
- PBMS_CREATION_TIME: The creation time of the BLOB in seconds since Jan. 1 1970.
Control headers are headers which are sent to the PBMS engine and have special meaning. At this time there is only one control header "PBMS_RETURN_HEADER_ONLY" which signals the PBMS engine to return the header and the BLOB or the header only. Example:
$ curl -H "PBMS_RETURN_HEADER_ONLY:yes" -D - "http://localhost:8080/test/MyBLOB"
The result would be:
HTTP/1.1 200 OK
PBMS_CHECKSUM: C36B9D679BF48385082118FC61C7E709
PBMS_BLOB_SIZE: 23
PBMS_LAST_ACCESS: 1236370182
PBMS_ACCESS_COUNT: 0
PBMS_CREATION_TIME: 1236370182
PBMS_BLOB_ALIAS: MyBLOB
About_the_blob: An example
Content-Length: 0
In the future other control headers are antipated to return general information about the repository or when sent with a BLOB upload to indicate how the BLOB is to be stored.
The PBMS BLOB repository contains BLOBs uploaded using the HTTP PUT or GET method, for example:
$ curl -d "This is an example text" http://localhost:8080/test
~*test/_1-246-57d672ac-0*23
This operation returns a reference to the BLOB in the repository. The BLOB can be retrieved from the repository using this URL, and the HTTP GET method. For example:
$ curl "http://localhost:8080/~*test/_1-246-57d672ac-0*23"
This is an example text
9.1. The Repository File Structure
Each database has its own BLOB repository. This means that if a BLOB reference is inserted into a table that belongs to another database, the BLOB will be copied over to the repository of the database containing the table. The repository doesn't reside in the database folder itself but in a folder with in the pbms folder with the same name as the database folder. The pbms folder resides in the MySQL databases folder along side the databases. When doing a 'show databases' it will apear as a database although it is not infact a database.
A repository consists of a number of repository files. The repository files have names of the form: repo-file_id.bs, for example repo-241.bs. file_id is the ID of the repository file.
Each repository file can contain a number of BLOBs. The number of BLOBs in a repository file depends on the size of the BLOBs and the current setting of the "repository threshold". The repository threshold is determined by setting the pbms_repository_threshold system variable. This is a value in bytes unless you specify: Kb, MB or GB after the value (note: the parameter must be set as a string value).
BLOBs are written sequentially to the repository files. Once a file reaches the threshold, a new repository files is created.
PBMS writes multiple repository files simultaneously. Each BLOB that is being uploaded is written to a different repository file in parallel.
Repository files may be a maximum of 256 Terabytes in size, and the maximum number of repository files is 16777215.
This provides a theoretically maximum repository size of 4 Zettabytes (2^72 bytes). The maximum size of a single BLOB is also 256 Terabytes.
Once a BLOB has been uploaded, PBMS sets a timer. When the timer expires, PBMS deletes the BLOB automatically, unless the BLOB has been retained in the meantime by the act of inserting it into a streaming-enabled table.
This also applies to BLOBs that are deleted or no longer referenced due to a database UPDATE.
The amount of time waited is determined by the pbms_temp_blob_timeout system parameter. This is a value in seconds. By default the value is set to 10 minutes.
The information about which BLOBs are to be deleted and when, is stored in the temporary log files. These files are stored in the bs-logs sub-directory of each database. The temporary log files have names of the form temp-log_id.bs. log_id is the temporary log file ID.
Once a log is full, a new one is created with the next highest ID. The maximum size of a temporary log file is determined by the pbms_temp_log_threshold system variable, which is a value in bytes, unless one of the following is specified: Kb, MB or GB after the value (note: the parameter must be set as a string value).
The maximum temporary log size is 4GB.
When BLOBs are deleted from the repository files, the free space is not immediately re-used. Instead, the system records the amount of free space in the repository file. This is also know as the garbage level of the file, and is measured as a percentage of the size of the file.
When the garbage level of a repository file exceeds a certain threshold PBMS begins to compact the repository file. It does this by creating a new repository file and copying all valid data to the new file. After this is done, it deletes the old repository file.
The garbage threshold is determined by the pbms_garbage_threshold system variable. Set this variable to a value between 1 and 100. The default value is 50, which means that a repository file is compacted as soon as it contains 50% garbage or more.
Note that compaction may not begin immediately. This is due to the fact that some repository files may be locked. A repository file is locked when it is being used to receive uploaded BLOB data. In this case, the file will only be unlocked when it has reached the repository threshold (pbms_repository_threshold).
PBMS has five system tables which provide a view of the BLOB repository. The tables are called pbms_repository , pbms_blob, pbms_metadata, pbms_reference, and pbms_metadata_header.

The pbms_repository, pbms_blob and pbms_metadata tables can be thought of as views into the PBMS BLOB repository files. A reord in the pbms_repository table and it's related records in the pbms_blob and pbms_metadata constitute 1 phsical block of data in a BLOB repository file. The pbms_reference table contains a record for each reference to a BLOB including references from temporary log files.
The system tables are created automatically the first time they are referenced but to clarify their structure a create statement will be given for each table.
10.1 The pbms_repository Table
CREATE TABLE pbms_repository (
Blob_alias CHAR(1024) COMMENT "User supplied BLOB identifier",
Repository_id INT COMMENT 'The repository file number',
Repo_blob_offset BIGINT COMMENT 'The offset of the BLOB in the repository file',
Blob_size BIGINT COMMENT 'The size of the BLOB in bytes',
MD5_Checksum CHAR(32) COMMENT "The MD5 Digest of the BLOB data",
Head_size SMALLINT UNSIGNED COMMENT 'The size of the BLOB header - proceeds the BLOB data',
Access_code INT COMMENT 'The 4-byte authorisation code required to access the BLOB - part of the BLOB URL',
Creation_time TIMESTAMP COMMENT 'The time the BLOB was created',
Last_ref_time TIMESTAMP COMMENT 'The last time the BLOB was referenced',
Last_access_time TIMESTAMP COMMENT 'The last time the BLOB was accessed (read)',
Access_count INT COMMENT 'The count of the number of times the BLOB has been read'
) ENGINE=PBMS;
As mentioned above, this table contains one row for every BLOB in the repository. The BLOB is identified by the repository file ID (Repository_id column), and the offset into the file (Repo_blob_offset column). The Blob_alias, when supplied on upload, can also be used as a unique identifier of the BLOB with in the database. This table is read only.
Blob_size is the uncompressed size of the BLOB in bytes. Head_size is the size of the BLOB header which proceeds the BLOB data.
MD5_Checksum is the hex representation of the MD5 digest of the BLOB data.
The Access_code column contains the BLOBs unique 4-byte access code.
Creation_time is the time the BLOB was created. Last_ref_time is updated whenever the BLOB is referenced and Last_access_time is updated each time the BLOB is downloaded.
The Content_type field contains the HTTP content type of the data that was specified when the BLOB was uploaded. It is returned in the HTTP header when the BLOB is downloaded.
CREATE TABLE pbms_blob (
Repository_id INT COMMENT 'The repository file number',
Repo_blob_offset BIGINT COMMENT 'The offset of the BLOB in the repository file',
Blob_data LONGBLOB COMMENT 'The data of the BLOB',
) ENGINE=PBMS;
This table contains the actual BLOB data. This is a read only table.
Repository_id and Repo_blob_offset identify the BLOB in the pbms_repository table.
The Blob_data contains the actual BLOB data of the BLOB. This mean that the data can be retrieved from this table using a normal SQL SELECT statement.
However, BLOBs in the PBMS repository may be larger the 4GB, which is the maximum size of a MySQL LONGBLOB value. Note that other limitations, such as computer memory and the MySQL maximum packet size will also affect the maximum size of BLOBs that can be returned by a SELECT statement.
CREATE TABLE pbms_metadata (
Repository_id INT COMMENT 'The repository file number',
Repo_blob_offset BIGINT COMMENT 'The offset of the BLOB in the repository file',
Name CHAR( 32) COMMENT 'The name of the metadata',
Value CHAR( 1024) COMMENT 'The value of the metadata',
) ENGINE=PBMS;
This table contains the metadata associated with a BLOB including the BLOB alias. This table is updatable.
Repository_id and Repo_blob_offset identify the BLOB in the pbms_repository table.
Name is the name of the metadata value and when combined with the Repository_id and Repo_blob_offset provide the primary key to this table.
You can add, modify, or delete metadata associated with a BLOB by performing INSERTS, UPDATES, or DELTES on this table. This includes assigning aliases to BLOBs.
10.4. The pbms_reference Table
CREATE TABLE pbms_reference (
Table_name CHAR(64) COMMENT 'The name of the referencing table',
Blob_id BIGINT COMMENT 'The BLOB reference number - part of the BLOB URL',
Column_ordinal INTCOMMENT 'The column position in the table starting at position 1',
Blob_url VARCHAR(200) COMMENT 'The BLOB URL for HTTP GET access',
Repository_id INT COMMENT 'The repository file number of the BLOB',
Repo_blob_offset BIGINT COMMENT 'The offset in the repository file',
Blob_size BIGINT COMMENT 'The size of the BLOB in bytes',
Deletion_time TIMESTAMP COMMENT 'The time the BLOB was deleted',
Remove_in INT COMMENT 'The number of seconds before the reference/BLOB is removed perminently',
Temp_log_id INT COMMENT 'Temporary log number of the referencing deletion entry',
Temp_log_offset BIGINT COMMENT 'Temporary log offset of the referencing deletion entry'
) ENGINE=PBMS;
This table contains one row for every reference to every BLOB in the database. The reference is identified by the table name (Table_name column) and the BLOB ID (Blob_id column).
The Column_ordinal identifys the table column that references the BLOB. This column may be NULL if the BLOB is currently not referenced by any row in the table. In the case of an unreferenced blob, the Remove_in will indicate when the BLOB will be removed if there are no references at all.
Blob_url contains the BLOB repository reference required to retrieve the BLOB.
Repository_id and Repo_blob_offset identify the BLOB in the pbms_repository table.
Blob_size is the uncompressed size of the BLOB in bytes.
Deletion_time is the time the BLOB was deleted. This is the time the BLOB temporary timer was set. The Remove_in column contains the remaining time in seconds, before the reference is deleted. If the reference is removed, and there are not other references to the BLOB, the BLOB will be deleted as well.
Temp_log_id and Temp_log_offset identify where the timer is stored in the temporary log files.
10.5. The pbms_metadata_header Table
CREATE TABLE pbms_metadata_header (
Name CHAR( 32) COMMENT 'The name of the metadata',
) ENGINE=PBMS;
This table contains the HTTP headers that will be treated as metadata names when a BLOB is uploaded. Any header sent with the BLOB that contained in this table will be stored, along with it's value, as metadata in the BLOB repository. This table is updatable.
The following system variables are available for controlling the behavior of the PBMS engine.
| Name | Default | Description |
| pbms_garbage_threshold | 50 | The percentage of garbage in a repository file before it is compacted. |
| pbms_port | 8080 | The port for the PBMS stream-based communications. Setting this value to 0 will disable HTTP communication with the engine. |
| pbms_repository_threshold | 128MB | The maximum size of a BLOB repository file. You may use Kb, MB or GB to indicate the unit of the value. A value in bytes is assumed when no unit is specified. |
| pbms_temp_blob_timeout | 600 | The timeout, in seconds, for temporary BLOBs. Uploaded BLOB data is removed after this time, unless they are referenced by a record in the database. |
| pbms_temp_log_threshold | 32MB | The maximum size of a temporary BLOB log file. You may use Kb, MB or GB to indicate the unit of the value. A value in bytes is assumed when no unit is specified. |
| pbms_max_keep_alive | 10ms | The timeout for inactive connection with the keep-alive flag set. After this time the connection will be closed. The time-out is in milliseconds (1/1000). |
| pbms_metadata_headers | "Content-Type" | A ':' delimited list of metadata headers to be used to initialize the pbms_metadata_header table when a database is created. |
All variables can be set at runtime except pbms_port.
To set variables on the mysqld command line, use the syntax: --variable_name=value, for example:
--pbms_repository_threshold=10GB.
pbms_field_references is a boolean value and can be enabled on the mysqld command line as follows:
--pbms_field_references, and disabled as follows: --skip_pbms_field_references.
On the command line '-' may be used in place of '_', for example: --pbms-garbage-threshold=20.
Variable that can be set at runtime are set using the SET GLOBAL statement, for example:
mysql> set global pbms_temp_log_threshold = "80MB";
Note that pbms_repository_threshold and pbms_temp_log_threshold are string values and so the value assigned must be quoted.
12. MySQL Connector/J Streaming Enabled
The MySQL Connector/J Streaming Enabled is a version of the standard MySQL JDBC driver which has been modified to enable BLOB streaming with the PBMS BLOB streaming engine.
The modified source code is available with the standard PBMS source code distribution. An already built .jar file with the modifications is also available from http://www.blobstreaming.org/download.
The Connector/J SE makes working with the PBMS BLOB repository transparent to the programmer. It hides the fact that a reference is stored in the table field instead of the actual BLOB data.
BLOBs of any length can be stored and retrieved from the database without setting the value of the max_allowed_packet system variable. This is because BLOB data is actually streamed to and from the database. Values inserted using setBinaryStream(), for example, are automatically uploaded to the BLOB repository by the JDBC driver using HTTP PUT, as described in the section: 5. Storing BLOB References in Tables.
The JDBC driver automatically retrieve data stored in the repository when it is selected from the database. Using getBinaryStream(), you can take control of the stream of data returned by the HTTP GET operation as described in section 6. Retrieving BLOBs from Tables.
Note that the BLOB data is only retrieved when it is actually required. Just selecting rows which contains BLOB stored in the repository does not cause the JDBC driver to fetch the data from the repository. This means table rows that contain BLOBs stored in the repository do not influence the speed of the SELECT. Only when the data is accessed, for example using getBytes(), getString(), getBinaryStream(), etc. is the data actually retrieved from the repository.
Enable BLOB streaming by adding the following option to the JDBC URL: enableBlobStreaming=true. For example:
Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection conn = null;
try {
conn = DriverManager.getConnection("jdbc:mysql://localhost/test"+
"?user=root&enableBlobStreaming=true");
// Do something with the connection
}
catch (Exception e) {
// Handle the exception
e.printStackTrace();
}
finally {
if (conn != null)
conn.close();
}
When set to true the JDBC driver will upload values set with setBinaryStream(), setAsciiStream() and setBlob() to the PBMS BLOB repository, but only if a valid length is specified. For example, a value inserted using setBinaryStream() with a length of -1 will be not be uploaded to the repository. Instead the value will by inserted directly into the table, as if enableBlobStreaming were set to false.
Here is an example of how to insert a BLOB into a table.
public void insertBLOB(Connection conn, InputStream in, int len)
throws SQLException
{
PreparedStatement ps = null;
try {
ps = conn.prepareStatement("insert my_test_tab(n_text) values(?)");
ps.setBinaryStream(1, in, len);
int rows = ps.executeUpdate();
}
finally {
if (ps != null)
ps.close();
}
}
Note that currently only LONGBLOB type columns of PBXT tables may contain streamable BLOB data. This means that when enableBlobStreaming=true you should only use setBinaryStream(), setAsciiStream() and setBlob() with a valid length on columns of type LONGBLOB belonging to a PBXT table.
** WARNING ** The JDBD driver does not check that the database column or table is the correct type . If use setBinaryStream(), setAsciiStream() or setBlob() with a length greater or equal to one on columns that are not of type LONGBLOB, or table type other then PBXT, data will be lost!
To avoid this problem, specify a length of -1 when using setBinaryStream(), setAsciiStream() or setBlob() on columns that are not of type LONGBLOB, or when using a table type other than PBXT.
Selected values of type LONGBLOB are automatically checked for valid BLOB repository references. The reference is then used to automatically download the BLOB data from the BLOB repository. Using a getBinaryStream(), getAsciiStream() or getBlob() you have direct access to the BLOB data stream from the repository.
Here is an example of how to retrieve a BLOB from a table:
public static InputStream selectBLOB(Connection conn, int id) throws SQLException
{
Statement st = null;
InputStream in = null;
ResultSet rs;
try {
st = conn.createStatement();
rs = st.executeQuery(
"select n_text from pbms_test_tab where n_id = "+id);
rs.next();
in = rs.getBinaryStream("n_text");
}
finally {
if (st != null)
st.close();
}
return in;
}
If you required the length of the BLOB before you begin to retrieve the data, then use getBlob(), instead of rs.getBinaryStream().
The current implementation is based on JDK 1.5.x, which only allows for lengths in setBinaryStream(), setAsciiStream() and setBlob() of type 'int'. This means that the maximum BLOB size that can currently permitted is 2GB.
As mentioned above, streamable BLOBs can only be stored in LONGBLOB type columns in PBXT tables.
12.4. Building and Testing JDBC
This source code distribution of the PBMS storage engine includes the modified source code of the MySQL Connector/J JDBC driver, in a sub-directory called java. This directory also contains a ready build JDBC driver in the form of a .jar file.
For debugging purposes you can include the source code in your project or build the JDBC driver yourself. To do this you will need Apache Ant version 1.6 or newer (available from http://ant.apache.org/). You will also need the Java JDK-1.4.2 or later.
Enter the following command to compile and create the driver:
$ ant dist
After this you can test the BLOB streaming using the 'runtest.sh' shell script:
$ ./runtest.sh
The test connects to a local MySQL server and creates a PBXT table. The test assumes that you have already installed the PBXT and PBMS storage engines.
PBMS provides a set of UDFs that enable tables from non strreaming-enabled engines to be used with the PBMS engine. They should not be used with tables created by streaming-enabled engines. The UDFs can either be used directly in SQL statements sent to the MySQL server or indirectly through PBMS client API calls.
Before they can be used they must be declared. This only needs to be done once on any given MySQL server. You can do this by calling the client API call pbms_init_udfs() or executing the following MySQL statements:
CREATE FUNCTION pbms_insert_blob_trig RETURNS STRING SONAME "libpbms.so";
CREATE FUNCTION pbms_update_blob_trig RETURNS STRING SONAME "libpbms.so";
CREATE FUNCTION pbms_delete_blob_trig RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_delete_all_blobs_in_table RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_rename_table_with_blobs RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_enabled_engine RETURNS INTEGER SONAME "libpbms.so";
| Name |
Description |
| pbms_insert_blob_trig() | Notifies PBMS of data being inserted into a BLOB streaming-enabled column. |
| pbms_update_blob_trig() | Notifies PBMS of data in a BLOB streaming-enabled column being updated. |
| pbms_delete_blob_trig() | Notifies PBMS of data in a BLOB streaming-enabled column being deleted. |
| pbms_delete_all_blobs_in_table() | Notifies PBMS to remove all BLOB references for a table. |
| pbms_rename_table_with_blobs() | Notifies PBMS that a streaming-enabled table has been renamed. |
| pbms_enabled_engine() | Test is a specific engine is streaming-enabled. |
STRING pbms_insert_blob_trig(STRING database, STRING table, INTEGER col_position, STRING blob_url)
Description
Notifies PBMS of data being inserted into a BLOB streaming-enabled column.
Parameters
- The target database.
- The target table.
- The ordinal position of the BLOB reference column. The first column in the table has ordinal position 1. This column can be any column type capable of holding a string value of up to 200 characters long. By convention this column is normally a longblob type and the C API functions pbms_init_table() and pbms_reset_table_blobs() assume that BLOB reference columns are longblob type and all longblob columns are BLOB reference column.
- The BLOB reference being inserted.
Return value
- On success the BLOB reference to be inserted into the BLOB reference column is returned. This may be different from the BLOB reference being inserted.
- On failure NULL is returned and an error message is printed to the server's stderr output.
STRING pbms_update_blob_trig(STRING database, STRING table, INTEGER col_position, STRING old_blob_url, STRING new_blob_url)
Description
Notifies PBMS of data in a BLOB streaming-enabled column being updated update.
Parameters
- The target database.
- The target table.
- The ordinal position of the BLOB reference column. The first column in the table has ordinal position 1. This column can be any column type capable of holding a string value of up to 200 characters long. By convention this column is normally a longblob type and the C API functions pbms_init_table() and pbms_reset_table_blobs() assume that BLOB reference columns are longblob type and all longblob columns are BLOB reference column.
- The old BLOB reference being updated.
- The new BLOB reference.
Return value
- On success the new BLOB reference to be used in the update is returned. This may be different from the new BLOB reference passed into the function.
- On failure NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_delete_blob_trig(STRING database, STRING table, INTEGER col_position)
Description
Notifies PBMS of data in a BLOB streaming-enabled column being deleted.
Parameters
- The target database.
- The target table.
- The ordinal position of the BLOB reference column. The first column in the table has ordinal position 1. This column can be any column type capable of holding a string value of up to 200 characters long. By convention this column is normally a longblob type and the C API functions pbms_init_table() and pbms_reset_table_blobs() assume that BLOB reference columns are longblob type and all longblob columns are BLOB reference column.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_delete_all_blobs_in_table(STRING database, STRING table)
Description
Notifies PBMS to remove all BLOB references for a table. This function should be called after a streaming-enabled table has been dropped.
Parameters
- The target database.
- The target table.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_rename_table_with_blobs(STRING database, STRING old_table, STRING new_table)
Description
Notifies PBMS that a streaming-enabled table has been renamed This function should be called after a streaming-enabled table has been renamed.
Parameters
- The target database.
- The old target table name.
- The new target table name.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_enabled_engine(STRING engine)
Description
Test is a specific engine is streaming-enabled.
Parameters
- The name of the engine being tested.
Return value
- The value 1 is returned if the engine is streaming-enabled.
- The value 0 is returned if the engine is not streaming-enabled.
- On failure NULL is returned and an error message is printed to the server's stderr output.
The C API is provided through the pbmscl library which is installed in the same location as the MySQL libraries when PBMS is installed.
One of the features of the API is that the 'get' and 'put' functions used to transfer the BLOB data between the client and server do so by using callback functions that get called one or more times during the transfer. This enables the client to transfer large BLOBs with out having too buffer the entire BLOB during the transfer.
Example code
An example is included with the PBMS source code under the 'pbmstest' directory. The example demonstrates a couple of different ways in which the API can be used to handle BLOB data.
The test demonstrates inserting and fetching of BLOB data in the form of text entered by the user or files. The insertion and fetching of files is a good example of how BLOB data can be transferred with out having to buffer the entire BLOB. The test also includes examples on how to set and get BLOB metadata including the use of BLOB aliases.
| Name |
Description |
|
| pbms_library_init() | Initialize PBMS API. This should be called to initialize the PBMS library before you call any other PBMS library function. | |
| pbms_library_end() | Terminate PBMS API use. This should be called to free all PBMS library resources when you are done using the PBMS library. | |
| pbms_errno() | Returns the error code for the most recently invoked API function that can succeed or fail. | |
| pbms_error() | Returns a null-terminated string containing the error message for the most recently invoked API function that can succeed or fail. | |
| pbms_connect() | Establishes a connection to a PBMS BLOB streaming server via a host name and port number. | |
| pbms_mysql_connect() | Establishes a connection to a PBMS BLOB streaming server via a MySQL connection handle. |
|
| pbms_close() | Closes a statement handle or a connection to the PBMS BLOB streaming server. | |
| pbms_set_option() | Sets a connection option value for the PBMS connection handle. | |
| pbms_get_option() | Gets a connection option value for the PBMS connection handle. | |
| pbms_add_metadata() | Adds a metadata header to a PBMS statement. | |
| pbms_clear_metadata() | Removes all metadata headers from a statement. | |
| pbms_reset_metadata() | Resets the metadata cursor to the start returning the number of metadata values in the statement. | |
| pbms_next_metadata() | Gets the next metadata value from the statement's metadata cursor. | |
| pbms_get_metadata_value() | Gets a specific metadata value for the BLOB just downloaded on the statement. | |
| pbms_set_md5_digest() | Set the MD5 digest for the BLOB about to be uploaded on the statement. | |
|
Get the Md5 digest for the BLOB just downloaded on the statement. | |
| pbms_get_info() | Get BLOB headers/metadata only. | |
| pbms_put_data() | Sends data to the PBMS BLOB streaming server and receives a BLOB reference after completion. | |
| pbms_get_data() | Get data from the PBMS BLOB streaming server. | |
| pbms_is_blob_reference() | Checks the format of the BLOB reference and returns true if it looks like a valid reference. | |
| pbms_get_blob_size() | Get the size of the BLOB for the specified BLOB reference. | |
| pbms_init_udfs() | Declares the PBMS UDFs on the server if they do not already exist. | |
| pbms_enabled() | Test is a specific engine is streaming-enabled. | |
| Functions used with tables created by non streaming-enabled engines | ||
| pbms_init_table() | Converts any longblob columns in the table into streaming-enabled columns by creating PBMS notification triggers on them. | |
| pbms_reset_table_blobs() | Recreates PBMS notification triggers on 'longblob' columns in the table. | |
| pbms_drop_table_blobs() | Notifies PBMS to remove all BLOB references for a table. | |
| pbms_table_renamed() | Notifies PBMS that a streaming-enabled table has been renamed. |
| Option |
Parameter type |
Default | Description |
| PBMS_OPTION_ KEEP_ALIVE |
unsigned int* | false | A boolean flag to indicate if the keep-alive flag should be set on HTTP requests. |
| PBMS_OPTION_ HOST |
(const char *)* | localhost | The connection's Blob Streaming server host IP address. (Read Only) |
| PBMS_OPTION_ PORT |
unsigned int* | 8080 | The connection's Blob Streaming server host port number. (Read Only) |
| PBMS_OPTION_ TRANSMITION_TIMEOUT |
unsigned int* | 0 | If greater than zero this sets a limit to how long a blob transfer can take. |
| PBMS_OPTION_ DATABASE |
(const char*)* | none | The database associated with the connection. |
pbms_bool pbms_library_init(pbms_bool thread_safe)
Description
Initialize PBMS API. This should be called to initialize the PBMS library before you call any other PBMS library function.
Parameters
- thread_safe: Set thread_safe to 'true' if the application requires the library to be thread safe.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
void pbms_library_end()
Description
Terminate PBMS API use. This should be called to free all PBMS library resources when you are done using the PBMS library.
int pbms_errno(PBMS pbms)
Description
For the handle specified by pbms, pbms_errno() returns the error code for the most recently invoked API function that can succeed or fail.
Parameters
- pbms: The pbms handle the error information is being requested for. If NULL then the last error for a non-handle specific API function is returned.
Return value
- A return code of zero indicates the most recent API function did not report an error.
- The PBMS error code for the most recently invoked API function that can succeed or fail.
const char *pbms_error(PBMS pbms)
Description
For the handle specified by pbms, pbms_error() returns a null-terminated string containing the error message for the most recently invoked API function that can succeed or fail.
Parameters
- pbms: The pbms handle the error information is being requested for. If NULL then the last error for a non-handle specific API function is returned.
- A PBMS error string for the most recently invoked API function that can succeed or fail. An empty string means that no error occurred.
PBMSC pbms_connect(const char* host, unsigned int port, const char *database)
Description
Establishes a connection to a PBMS BLOB streaming server running on host. pbms_connect() must complete successfully before you can execute any other API functions that require a valid PBMS connection handle structure.
Internal connections to the PBMS server are only opened as needed and are closed after each access unless the PBMS_OPTION_KEEP_ALIVE is set to true. Even then the connection will only remain open for as long as the server's max_keep_alive setting allows.
Parameters
- host: The hostname or an IP address. If host is NULL or an empty string, a connection to the local host is assumed.
- port: The TCP/IP port number of the PBMS BLOB streaming server.
- database: The database to associate with this connection. This can be changed later with pbms_set_option() and PBMS_OPTION_DATABASE.
- On success A PBMS connection handle. is returned.
- On failure NULL is returned. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
PBMS pbms_mysql_connect(struct st_mysql *mysql, const char *database)
Description
Establishes a connection to a PBMS BLOB streaming server associated with the MYSQL connection handle.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- database: The database to associate with this connection. This can be changed later with pbms_set_option() and PBMS_OPTION_DATABASE.
- On success A PBMS connection handle. is returned.
- On failure NULL is returned. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
void pbms_close(PBMS pbms)
Description
Closes the connection and frees all of it's resources.
Parameters
- pbms: A valid PBMS handle.
pbms_bool pbms_set_option(PBMS pbms, enum pbms_option option, const void *in_value)
Description
Sets a connection option value for the PBMS handle.
Parameters
- pbms: A valid PBMS handle.
- option: The connection option to be get.
- in_value: A pointer to the option dependent value.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_get_option(PBMS pbms, enum pbms_option option, const void *out_value)
Description
Gets a connection option value for the PBMS handle.
Parameters
- pbms: A valid PBMS connection handle.
- option: The connection option to be get.
- out_value: A pointer to the option dependent buffer into which the requested value will be placed.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_add_metadata(PBMS pbms, const char *name, const char *value)
Description
Adds the metadata to the connection to be associated with the next BLOB uploaded on the connection.
Parameters
- pbms: A valid PBMS handle.
- name: The name of the metadata. The name cannot exceed PBMS_META_NAME_SIZE bytes.
- value: The value of the metadata. The value cannot exceed PBMS_META_VALUE_SIZE bytes.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_clear_metadata(PBMS pbms)
Description
Removes all metadata associated with the connection.
Parameters
- pbms: A valid PBMS handle.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
int pbms_reset_metadata(PBMS pbms)
Description
Resets the position of the metadata cursor to the start of the metadata for the downloaded BLOB.
Parameters
- pbms: A valid PBMS handle.
Return value
- Returns the number of metadata records in the cursor.
pbms_bool pbms_next_metadata(PBMS pbms, char *name, char *value, size_t *v_size)
Description
Fetches the next metadata record.
Parameters
- pbms: A valid PBMS handle.
- name: A buffer for the name of the metadata. This buffer must be atleast PBMS_META_NAME_SIZE big.
- value: A buffer for the metadata value.
- v_size: A pointer to a variable containing the size of the 'value' buffer. On return this variable will contain the actual size required for the value. If in doubt the caller should check this variable after the call to see if the value was truncated. If NULL is passed in then the buffer is assumed to be large enough to hold the metadata value.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details. If false is returned but there is no error then the end of the cursor has been reached.
pbms_bool pbms_get_metadata_value(PBMS pbms, const char *name, char *value, size_t v_size)
Description
Gets a specific metadata value from the connection's metadata cursor.
Parameters
- pbms: A valid PBMS handle.
- name: The name of the metadata to get.
- value: A buffer for the metadata value.
- v_size: A pointer to a variable containing the size of the 'value' buffer. On return this variable will contain the actual size required for the value. If in doubt the caller should check this variable after the call to see if the value was truncated. If NULL is passed in then the buffer is assumed to be large enough to hold the metadata value.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.If false is returned but there is no error then metadata value could not be found.
pbms_bool pbms_set_md5_digest(PBMS pbms, const char *md5_digest)
Description
Specifies the MD5 digest for the BLOB data about to be sent. This is not required but is a good thing to provide if you are concerned about possible corruption of data in transit.
Parameters
- pbms: A valid PBMS handle.
- md5_digest: The 16 byte MD5 digest string.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_get_md5_digest(PBMS pbms, char *md5_digest)
Description
Gets the MD5 digest string of the BLOB just downloaded..
Parameters
- pbms: A valid PBMS handle.
- md5_digest: The 16 byte MD5 digest string.
Return value
- Returns true on success
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
typedef char *(* PBMS_CALLBACK_FUNC) (void *caller_data, char *buffer, size_t *buffer_size, size_t size);
Description
The user defined callback function is called when transferring data. The callback will continue to be called until an error occurs or the callback returns NULL.
Parameters
- caller_data: The 'caller_data' parameter passed into the pbms_put_data() or pbms_get_data() function which is a pointer to a user defined data structure.
- buffer: The first time he callback is called this will be the buffer pointed to by the 'buffer' parameter in the pbms_put_data() or pbms_get_data() function after that it will be the value returned by the previous call to the callback.
- buffer_size: On input 'buffer_size' contains the actual number of bytes transferred by the last call. On output this parameter is set to the number of bytes of data the user is prepared to transfer. The size of the buffer returned by the callback must be at least this size.
If. the connection option PBMS_OPTION_TRANSMITION_TIMEOUT is set to zero the value passed in will be unaltered from the last time the callback was called, or if this is the first call to the callback, the buffer_size value passed into pbms_put_data() or pbms_get_data().
If the connection option PBMS_OPTION_TRANSMITION_TIMEOUT is not zero and the transfer operation timed out then the value passed in may be less than the maximum bytes that the caller had been prepared to transfer. - size: The 'size' parameter contains the number of bytes remaining to be transferred.
Return value
- When more data is to be transferred the callback returns a pointer to the next location, into which or from which, the data is to be transferred. The size of this buffer must be at least as large as the value returned in the 'size' parameter.
- NULL is returned when the user is not prepared to send or receive anymore data. If NULL is returned by the callback before the remaining number of bytes to be transferred reaches zero the transfer operation will report an "Unexpected end of data." error.
Example:
See the examples provided for pbms_put_data() and pbms_get_data().
pbms_bool pbms_put_data(PBMS pbms, const char *table, const char *alias, char *ref, size_t total_data_size, char *buffer, size_t data_size, PBMS_CALLBACK_FUNC cb, void *caller_data)
Description
Sends data to the PBMS BLOB streaming server and receives a BLOB reference after completion.
Parameters
- pbms: A valid PBMS connection handle.
- table: The name of the table into which the BLOB is to be inserted. This parameter is optional but should be supplied if known to make BLOB insertion more efficient.
- alias: The user defined BLOB alias. This parameter can be NULL but if supplied the alias must be unique with in the database.
- ref: A pointer to a BLOB reference buffer into which the BLOB reference will be place on successful completion of the function call.
- total_data_size: The total size, in bytes, of the data to be sent.
- buffer: A pointer to the initial block of data. This can be NULL if a callback is supplied.
- data_size: The number of bytes of data in 'buffer'. If 'data_size' == 'total_data_size' then a callback is not required.
- cb: The callback function through which the data is sent to the server. If a callback is supplied it will be called even if all the data was supplied in the buffer.
- caller_data: A pointer to user defined data passed into the callback function.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
Example:
#define BUFFER_SIZE 100
char *my_put_file_callback(void *caller_data, char *buffer, size_t *buffer_size, size_t size)
{
int *fh = (int*) caller_data;
void *data = NULL;
if (size) {
data = buffer;
*buffer_size = read(fh, data, BUFFER_SIZE);
}
return data;
}
main ()
{
PBMS pbms;
PBMS_BLOB_REF blob_ref;
char buffer[BUFFER_SIZE];
pbms_library_init(false);
pbms = pbms_connect("localhost", 8080, "test");
char *data = "A test string.";
size_t size = strlen(info.data) +1;
// Send a blob directly with out using the callback:
pbms_put_data(pbms, NULL, "myBlob1", &blob_ref, size, data, size, NULL, NULL);
printf("Blob reference = \"%s\"\n", (char *) &blob_ref);
// Send a file using a callback:
int fh = open("myblob.jpg", O_RDONLY, 0000777);
data = NULL;
size = lseek(fh, 0, SEEK_END);
lseek(fh, 0, SEEK_SET);
pbms_put_data(pbms, NULL , "MyBlob2", &blob_ref, size, buffer, 0, my_put_file_callback, &fh);
printf("Blob reference = \"%s\"\n", (char *) &blob_ref);
close(fh);
pbms_close(pbms);
pbms_library_end();
}
pbms_bool pbms_get_data(PBMS pbms, const char *ref, char *buffer, size_t buffer_size, PBMS_CALLBACK_FUNC cb, void *caller_data)
Description
Get data from the PBMS BLOB streaming server for the specified BLOB reference or alias.
Parameters
- pbms: A valid PBMS connection handle.
- ref: A pointer to a BLOB reference for the data to be received.
- buffer: A pointer to the initial initial buffer for the data. This can be NULL if a callback is supplied.
- buffer_size: The size of the buffer. If this is equal to the size of the blob then a callback is not required.
- cb: The callback function through which the data is received from the server. If a callback is supplied it will be called even if the buffer supplied was big enough for all the data.
- caller_data: A pointer to user defined data passed into the callback function.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
Example:
#define BUFFER_SIZE 100
char *blob_to_file_callback(void *caller_data, char *buffer, size_t *buffer_size, size_t size)
{
i nt *fh = (int*) caller_data;
// Write out the last block of data received.
write(*fh, buffer, *buffer_size);
if (!size)
return NULL; // No more date.
buffer_size = BUFFER_SIZE;
return buffer;
}
char *my_get_callback(void *caller_data, char *buffer, size_t *buffer_size, size_t size)
{
MY_INFO *info = (MY_INFO*) caller_data;
if (!size)
return NULL;
info->size = 0;
return info->data;
}
main ()
{
PBMS pbms;
char buffer[BUFFER_SIZE];
int fh;
pbms_library_init(false);
pbms = pbms_connect("localhost", 8080, "test");
fh = open("myblob.jpg", O_WRONLY | O_CREAT | O_TRUNC, 0000777);
pbms_get_data(pbms, "MyBlob1", buffer, BUFFER_SIZE, blob_to_file_callback, &fh);
printf("Blob size = %d"\n", lseek(fh, 0, SEEK_END));
close(fh);
pbms_close(pbms);
pbms_library_end();
}
pbms_bool pbms_get_info(PBMS pbms, const char *ref )
Description
Get metadata and headers from the PBMS BLOB streaming server for the specified BLOB reference or alias.
Parameters
- pbms: A valid PBMS connection handle.
- ref: A pointer to a BLOB reference for the information to be received.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
Example:
main ()
{
PBMS pbms;
char buffer[PBMS_META_VALUE_SIZE];
time_t secs;
pbms_library_init(false);
pbms = pbms_connect("localhost", 8080, "test");
pbms_get_info(pbms, "MyBlob1");
printf("BLOB Info:\n");
pbms_get_metadata_value(pbms, MS_ALIAS_TAG, buffer, NULL);
printf("ALIAS: %s", buffer);
pbms_get_metadata_value(pbms, MS_BLOB_SIZE, buffer, NULL);
printf("SIZE: %s", buffer);
pbms_get_metadata_value(pbms, MS_CHECKSUM_TAG, buffer, NULL);
printf("CHECKSUM: %s", buffer);
pbms_get_metadata_value(pbms, MS_ACCESS_COUNT, buffer, NULL);
printf("ACCESS_COUNT: %s", buffer);
pbms_get_metadata_value(pbms, MS_CREATION_TIME, buffer, NULL);
secs = atol(buffer);
ctime_r(secs, buffer);
printf("CREATION_TIME: %s", buffer);
pbms_get_metadata_value(pbms, MS_LAST_ACCESS, buffer, NULL);
secs = atol(buffer);
ctime_r(secs, buffer);
printf("LAST_ACCESS: %s", buffer);
pbms_close(pbms);
pbms_library_end();
}
pbms_bool pbms_is_blob_reference(PBMS pbms, const char *ref)
Description
Checks the format of the BLOB reference and returns true if it looks like a valid reference. This doesn't guarantee that the reference references a valid BLOB on the PBMS BLOB streaming server only that the format of the BLOB reference is correct.
Parameters
- pbms: A valid PBMS connection handle.
- ref: A pointer to the potential BLOB reference.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_get_blob_size(PBMS pbms, const char *ref, size_t *size)
Description
Get the size of the BLOB for the specified BLOB reference.
If the reference is a PBMS BLOB URL then the blob size is imbedded in the URL and so the size can be returned directly. But it the ref is not a URL it is assumed to be a BLOB alias and so a request will be sent to the PBMS server to get the information.
Parameters
- pbms: A valid PBMS connection handle.
- ref: A BLOB reference, either a PBMS BLOB URL or a BLOB alias.
- size: The address of the size valiable in which the BLOB size will be returned.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_init_udfs(struct st_mysql *mysql)
Description
Declares the PBMS UDFs on the server if they do not already exist.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_enabled(struct st_mysql *mysql, const char *engine, pbms_bool *enabled)
Description
Tests if an engine engine is streaming-enabled.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- engine: The name of an engine, "PBXT" or "MYISAM" for example.
- enabled: A pointer to a bool flag that will be set to 'true' or 'false' depending if the specified engine is streaming-enabled.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_init_table(struct st_mysql *mysql, const char *database, const char *table, pbms_bool no_blobs_ok)
Description
Converts any longblob columns in the table into streaming-enabled columns by creating PBMS notification triggers on them.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- database: The name of the target database.
- table: The name of the target table.
- no_blobs_ok: A flag indication if the failure to find any longblob columns in the table should be treated as an error.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_reset_table_blobs(struct st_mysql *mysql, const char *database, const char *table)
Description
Similar to .pbms_init_table() except that it is meant to be called if an alter table has been done that may have dropped or added longblob columns or changed the ordinals of longblob columns.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- database: The name of the target database.
- table: The name of the target table.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_drop_table_blobs(struct st_mysql *mysql, const char *database, const char *table)
Description
Notifies PBMS to remove all BLOB references for a table. This should be called after a streaming-enabled table has been dropped.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- database: The name of the target database.
- table: The name of the target table.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_table_renamed(struct st_mysql *mysql, const char *database, const char *old_table, const char *new_table )
Description
Notifies PBMS that a streaming-enabled table has been renamed.
Parameters
- mysql: A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- database: The name of the target database.
- old_table:The old name of the target table.
- new_table:The new name of the target table.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
Copyright 2009 by PrimeBase Technologies GmbH