Documentation - PBMS v0.5.06
PBMS stands for PrimeBase Media Streaming. The PBMS engine is a MySQL storage engine which enables the storage and streaming of BLOB data in and out of a MySQL database.
MySQL 5.1 provides a pluggable storage engine API which allows 3rd parties to extend the storage capabilities of the server. Storage engines are built as shared libraries and can be loaded at runtime by the server. Most storage engines provide an alternative way of storing the table data. By doing this they add functionality to the server in terms of transactional support, special indexing methods, data warehousing and archiving facilities or even clustering and high-availability capabilities.
To use a particular engine, you specify the name of the engine when creating a table in MySQL. For example:
CREATE TABLE my_test_tab (
n_id INTEGER PRIMARY KEY AUTO_INCREMENT,
n_text LONGBLOB,
n_string VARCHAR(255)
) ENGINE=PBXT;
This example creates a table using the PBXT storage engine.
The PBMS engine specializes in the storage and retrieval of BLOB data. Unlike other storage engines, it does not actually provide table storage. So you would never actually create tables of type PBMS.
Instead, PBMS works together with a standard storage engine, in order to add functionality to MySQL tables. The most basic additional functionality provided by PBMS is:
- Direct HTTP stream-based access to database information. For this purpose, the PBMS engine includes a lightweight HTTP server. The port of the server can by setting the pbms_port system variable on the mysqld command line. BLOB data can be uploaded using the HTTP PUT or POST method, and downloaded using the HTTP GET method.
- A repository for BLOB data. If an engine is not suitable for BLOB storage it can use the BLOB repository provided by PBMS. In this case, the engine simply stores a BLOB reference in the table column, instead of the actual BLOB data. PBMS tracks references to the repository in order to know when to free up resources.
2. "Streaming-enabled" Engines
A streaming-enabled storage engine is a MySQL storage engine that supports the PBMS server-side streaming API. PBMS blob references stored in the tables of a streaming-enabled engine will automatically increment and decrement BLOB reference counters when BLOB references are inserted, updated, and deleted. Streaming-enabled engines will also automatically notify the PBMS engine when tables containing blob references are altered or dropped.
The PBMS engine can be used with non streaming-enabled engines through the use of a set of User Defined Functions (UDFs) provided by the PBMS engine. These UDFs can be used in triggers to notify the PBMS engine when BLOB references are inserted, updated, or deleted in tables. UDFs are also provided that can be called to notify the PBMS engine when tables containing blob references are altered or dropped.
A streaming-enabled table is a MySQL table that either belongs to a "Streaming-enabled" engine or has triggers defined on blob referencing columns that increment and decrement BLOB reference counters when BLOB references are inserted, updated, and deleted.
The following is an example of how a streaming-enabled table can be defined for a non streaming-enabled engine:
Create table x.foo(c1 integer, c2 longblob) ENGINE = MYISAM;
create trigger x.foo_insert_trig BEFORE INSERT on x.foo for each row BEGIN set NEW.c2 = pbms_insert_blob_trig("x", "foo", 2, NEW.c2); END
create trigger x.foo_update_trig BEFORE UPDATE on x.foo for each row BEGIN set NEW.c2 = pbms_update_blob_trig("x", "foo", 2, OLD.c2, NEW.c2); END
create trigger x.foo_delete_trig BEFORE UPDATE on x.foo for each row BEGIN declare dummy integer; set dummy = pbms_delete_blob_trig("x", "foo", 2, OLD.c2); END
Streaming-enabled tables can store BLOB data in the PBMS BLOB repository where reference counts to the data are maintained so that the BLOB will automatically be deleted ounce the reference count drops to zero.
Currently, the only engine that is streaming-enabled is PBXT, version 0.9.88 later.
The most recent version of both engines are provided on this web-site. You can download the source code and compile the engines yourself or you can use one of the binaries that are provided. However, binaries are only available for a very limited number of platforms.
A BLOB reference column is a column in a streaming-enabled table which contains PBMS BLOB references and notifies PBMS when a reference in the column is inserted, updated or deleted. For tables created by stream-enabled engines, which columns in a table are treated as BLOB reference columns is engine dependent, currently in PBXT longblob columns are assumed to be BLOB reference columns. For tables created by non stream-enabled engines BLOB reference columns are the columns for which triggers are defined which call the PBMS UDFs to notify PBMS of inserts, updates, and deletes.
The PBMS client API pbms_init_table() and pbms_reset_table_blobs() functions assume that longblob columns are intended to be BLOB reference columns and includes them in the trigger definitions that they create.
4. Building and Installing PBMS
Please refer to the INSTALL document in the source code distribution for details of how to build and install the PBMS plug-in.
5. Accessing the database using BLOB References
PBMS uses the concept of a BLOB reference. A BLOB reference is, in fact, a Uniform Resource Locator (URL), similar to the Web address of an internet page.
There are currently 3 types of BLOB references:
5.1. A BLOB Repository Reference
This type of URL identifies a BLOB in the PBMS BLOB repository, and has two forms:
http://host-name:pbms-port//~*database-name/~repository-id
or
http://host-name:pbms-port//~*database-name/_repository-id
For example: http://localhost:8080/~*test/~1-150-660294d3-0*123 or http://localhost:8080/~*test/_1-246-57d672ac-0*123
The prefix '~*' helps to identify a BLOB repository URL. This is used by BLOB enabled engines to identify BLOB URLs when inserted into the database. The prefix is also used by the JDBC driver and other client software to recognize BLOB URLs.
BLOB repository reference with a '~' character before the repository-id are references that are already associated with a database table. BLOB repository reference with a '_' character before the repository-id are references that are not associated with a particular table, and are the result of uploading a BLOB using a BLOB database reference (see below).
The repository-id is a value generated by the PBMS BLOB repository. It has the following form: table_id-blob_id-access_code-server_id*size for references that are associated with a database table, and repo_id-repo_offset-access_code-server_id*size for references that are not associated with a database table.
The access code (access_code) is a unique 4-byte value generated by PBMS whenever a BLOB is uploaded into the BLOB repository. This code is required to retrieve the BLOB making it very unlikely for anyone to retrieve the BLOB without the correct URL.
The 'size' is the size of the actual BLOB in bytes.
This URL identifies a table in the database, and has the following form:
http://host-name:pbms-port//database-name/table-name
BLOB table references are used to upload data to a particular table in the BLOB repository. For example:
$ curl -d "This is an example text" http://localhost:8080/test/my_test_tab
The result returned by this command will be a partial BLOB repository reference, of the form:
~*database-name/~repository-id
In order to retrieve the value from the repository, add the host name and PBMS port to the URL, for example:
$ curl "http://localhost:8080/~*test/~1-150-660294d3-0*23"
5.3. A BLOB Database Reference
This type of URL identifies a database repository, and has the following form:
http://host-name:pbms-port//database-name
BLOB database references are used to upload data to a BLOB repository without specifying a table. For example:
$ curl -d "This is an example text" http://localhost:8080/test
The result returned by this command will be a partial BLOB repository reference, of the form:
~*database-name/_repository-id
In order to retrieve the value from the repository, add the host name and PBMS port to the URL, for example:
$ curl "http://localhost:8080/~*test/_1-246-57d672ac-0*123"
BLOB table references are more efficient than BLOB database references and should be used when ever the targeted table name is known.
5.4. A BLOB Field Reference (Experimental)
This URL identifies a particular field in the database, and has the form:
http://host-name:pbms-port//database-name/table-name/column-name/row-condition
Where row-condition has the following format: col1=val1&col2=val2&...
BLOB field references can be used to download data from a particular database field using an HTTP GET command. For example, assuming we inserted the following row in the example table from above:
mysql> INSERT my_test_tab (n_id, n_string) VALUES (1, "This is an example string");
To download the 'n_string' field you could do the following:
$ curl http://localhost:8080/test/my_test_tab/n_string/n_id=1
The result returned by this command will be "This is an example string".
Currently, only download of data is supported using BLOB field references. In addition, repository BLOB data cannot yet be uploaded using a BLOB field reference. Further development of this access method will depend on the implementation of adequate security.
NOTE: For security reasons, BLOB field references are not enabled by default. To enable BLOB field reference set the pbms_field_references system variable to TRUE. This can be done on the mysqld command line using --pbms-field-references, or when the server is running as follows:
mysql> set global pbms_field_references=TRUE;
** SECURITY WARNING - PBMS v0.5.02 or EARLIER ** Do not expose the PBMS HTTP port to a non-secure network, since the access to streaming-enabled tables using a BLOB field reference bypasses standard SQL-based security.
NOTE: BLOB field references are only supported by tables created by streaming-enabled engines.
6. Storing BLOB References in Tables
As explained above, BLOB data can be uploaded to the BLOB repository using a BLOB database or table references. The upload is done using any standard HTTP client that supports the HTTP PUT or POST method or the client API mybs_put_data() function.
For example, using curl I can upload BLOB data as follows:
$ curl -d "This is an example text" http://localhost:8080/test
The result will be a partial BLOB repository reference, for example: ~*test/_1-246-57d672ac-0*23.
Now to store this BLOB in a database table, you just insert the BLOB reference. For example:
mysql> INSERT my_test_tab (n_text) VALUES ("~*test/_1-246-57d672ac-0*23");
When the streaming-enable table receives this insert, it will call the PBMS engine in order to "retain" the BLOB repository reference in it's BLOB reference columns. Retain basically means that PBMS increments the reference count for the BLOB. PBMS keeps a list of references to each BLOB so that it knows when a BLOB can be freed in order to reclaim resources.
Note that BLOB references should only be inserted into BLOB reference columns in streaming-enabled tables. Only in this case will PBMS be called to retain the BLOB
The following diagram illustrates the steps that are taken to insert a BLOB into the database.
In step (1) the client software uploads the BLOB to the repository using an HTTP PUT request method. On reply, the repository returns a BLOB repository reference in step (2). Step (3), the reference is placed in the row to be inserted in the database. Using a standard SQL INSERT, the client inserts the row into the database table in step (4). In step (5), the streaming enabled table recognizes that BLOB URL has been inserted and automatically calls the BLOB Streaming engine to retain the BLOB.
7. Retrieving BLOBs from Tables
Once a BLOB reference has been stored in a table, a standard SELECT will return the BLOB reference, not the actual BLOB data.
For example:
mysql> select n_id, n_text from my_test_tab where n_id=5;
+------+-----------------------------------+
| n_id | n_text |
+------+-----------------------------------+
| 5 | ~*test/~2-128-57d672ac-0*23 |
+------+-----------------------------------+
The BLOB reference can then be used to retrieve the BLOB data using the HTTP GET method, for example:
$ curl "http://localhost:8080/~*test/~2-128-57d672ac-0*23"
This is an example text
The following diagram illustrates the process of downloading a BLOB from the repository:

In step (1) the row containing a BLOB repository reference is retrieved from the database using a standard SQL SELECT statement. In step (2) the client software recognizes that a valid BLOB URL is in the database field. With this information it creates a complete URL and downloads the BLOB data from the BLOB repository in step (3), using a standard HTTP GET request method. In step (4) the BLOB data is streamed back to the client application.
The PBMS BLOB repository contains BLOBs uploaded using the HTTP PUT or GET method, for example:
$ curl -d "This is an example text" http://localhost:8080/test
~*test/_1-246-57d672ac-0*23
This operation returns a reference to the BLOB in the repository. The BLOB can be retrieved from the repository using this URL, and the HTTP GET method. For example:
$ curl "http://localhost:8080/~*test/_1-246-57d672ac-0*23"
This is an example text
8.1. The Repository File Structure
Each database has its own BLOB repository. This means that if a BLOB reference is inserted into a table that belongs to another database, the BLOB will be copied over to the repository of the database containing the table.
A repository consists of a number of repository files. These files are stored in a sub-directory called bs-repository in the MySQL database directory. The repository files have names of the form: repo-file_id.bs, for example repo-241.bs. file_id is the ID of the repository file.
Each repository file can contain a number of BLOBs. The number of BLOBs in a repository file depends on the size of the BLOBs and the current setting of the "repository threshold". The repository threshold is determined by setting the pbms_repository_threshold system variable. This is a value in bytes unless you specify: Kb, MB or GB after the value (note: the parameter must be set as a string value).
BLOBs are written sequentially to the repository files. Once a file reaches the threshold, a new repository files is created.
PBMS writes multiple repository files simultaneously. Each BLOB that is being uploaded is written to a different repository file in parallel.
Repository files may be a maximum of 256 Terabytes in size, and the maximum number of repository files is 16777215.
This provides a theoretically maximum repository size of 4 Zettabytes (2^72 bytes). The maximum size of a single BLOB is also 256 Terabytes.
Once a BLOB has been uploaded, PBMS sets a timer. When the timer expires, PBMS deletes the BLOB automatically, unless the BLOB has been retained in the meantime by the act of inserting it into a streaming-enabled table.
This also applies to BLOBs that are deleted or no longer referenced due to a database UPDATE.
The amount of time waited is determined by the pbms_temp_blob_timeout system parameter. This is a value in seconds. By default the value is set to 10 minutes.
The information about which BLOBs are to be deleted and when, is stored in the temporary log files. These files are stored in the bs-logs sub-directory of each database. The temporary log files have names of the form temp-log_id.bs. log_id is the temporary log file ID.
Once a log is full, a new one is created with the next highest ID. The maximum size of a temporary log file is determined by the pbms_temp_log_threshold system variable, which is a value in bytes, unless one of the following is specified: Kb, MB or GB after the value (note: the parameter must be set as a string value).
The maximum temporary log size is 4GB.
When BLOBs are deleted from the repository files, the free space is not immediately re-used. Instead, the system records the amount of free space in the repository file. This is also know as the garbage level of the file, and is measured as a percentage of the size of the file.
When the garbage level of a repository file exceeds a certain threshold PBMS begins to compact the repository file. It does this by creating a new repository file and copying all valid data to the new file. After this is done, it deletes the old repository file.
The garbage threshold is determined by the pbms_garbage_threshold system variable. Set this variable to a value between 1 and 100. The default value is 50, which means that a repository file is compacted as soon as it contains 50% garbage or more.
Note that compaction may not begin immediately. This is due to the fact that some repository files may be locked. A repository file is locked when it is being used to receive uploaded BLOB data. In this case, the file will only be unlocked when it has reached the repository threshold (pbms_repository_threshold).
PBMS has two system tables which provide a view of the BLOB repository. The tables are called pbms_repository and pbms_reference.

The pbms_repository table contains all the BLOBs in the repository. The pbms_reference table contains a row for each reference to a BLOB. References from the temporary log files are also counted.
The system tables are created automatically the first time they are referenced.
As mentioned above, this table contains one row for every BLOB in the repository. The BLOB is identified by the repository file ID (Repository_id column), and the offset into the file (Repo_blob_offset column).
Blob_size is the uncompressed size of the BLOB in bytes. Head_size is the size of the BLOB header which proceeds the BLOB data.
The Access_code column contains the BLOBs unique 4-byte access code.
Creation_time is the time the BLOB was created. Last_ref_time is updated whenever the BLOB is referenced and Last_access_time is updated each time the BLOB is downloaded.
The Content_type field contains the HTTP content type of the data that was specified when the BLOB was uploaded. It is returned in the HTTP header when the BLOB is downloaded.
The Blob_data contains the actual BLOB data of the BLOB. This mean that the data can be retrieved from this table using a normal SQL SELECT statement.
However, BLOBs in the PBMS repository may be larger the 4GB, which is the maximum size of a MySQL LONGBLOB value. Note that other limitations, such as computer memory and the MySQL maximum packet size will also affect the maximum size of BLOBs that can be returned by a SELECT statement.
This table contains one row for every reference to every BLOB in the database. The reference is identified by the table name (Table_name column) and the BLOB ID (Blob_id column).
The Column_name and Row_condition identify the database field that references the BLOB. Row_condition contains a string of the form col1=val1&col2=val2&...
These columns may be NULL if the BLOB is currently not referenced by any row in the table or if the primary key of the table referencing the blob is unknown. In the case of an unreferenced blob, the Remove_in will indicate when the BLOB will be removed if there are no references at all.
Blob_url contains the BLOB repository reference required to retrieve the BLOB.
Repository_id and Repo_blob_offset identify the BLOB in the pbms_repository table.
Blob_size is the uncompressed size of the BLOB in bytes.
Deletion_time is the time the BLOB was deleted. This is the time the BLOB temporary timer was set. The Remove_in column contains the remaining time in seconds, before the reference is deleted. If the reference is removed, and there are not other references to the BLOB, the BLOB will be deleted as well.
Temp_log_id and Temp_log_offset identify where the timer is stored in the temporary log files.
The following system variables are available for controlling the behavior of the PBMS engine.
All variables can be set at runtime except pbms_port.
To set variables on the mysqld command line, use the syntax: --variable_name=value, for example:
--pbms_repository_threshold=10GB.
pbms_field_references is a boolean value and can be enabled on the mysqld command line as follows:
--pbms_field_references, and disabled as follows: --skip_pbms_field_references.
On the command line '-' may be used in place of '_', for example: --pbms-garbage-threshold=20.
Variable that can be set at runtime are set using the SET GLOBAL statement, for example:
mysql> set global pbms_temp_log_threshold = "80MB";
Note that pbms_repository_threshold and pbms_temp_log_threshold are string values and so the value assigned must be quoted. To set the value of pbms_field_references which is a boolean value you can use the constant values TRUE, FALSE or ON, OFF. For example:
mysql> set global pbms_field_references = ON;
11. MySQL Connector/J Streaming Enabled
The MySQL Connector/J Streaming Enabled is a version of the standard MySQL JDBC driver which has been modified to enable BLOB streaming with the PBMS BLOB streaming engine.
The modified source code is available with the standard PBMS source code distribution. An already built .jar file with the modifications is also available from http://www.blobstreaming.org/download.
The Connector/J SE makes working with the PBMS BLOB repository transparent to the programmer. It hides the fact that a reference is stored in the table field instead of the actual BLOB data.
BLOBs of any length can be stored and retrieved from the database without setting the value of the max_allowed_packet system variable. This is because BLOB data is actually streamed to and from the database. Values inserted using setBinaryStream(), for example, are automatically uploaded to the BLOB repository by the JDBC driver using HTTP PUT, as described in the section: 5. Storing BLOB References in Tables.
The JDBC driver automatically retrieve data stored in the repository when it is selected from the database. Using getBinaryStream(), you can take control of the stream of data returned by the HTTP GET operation as described in section 6. Retrieving BLOBs from Tables.
Note that the BLOB data is only retrieved when it is actually required. Just selecting rows which contains BLOB stored in the repository does not cause the JDBC driver to fetch the data from the repository. This means table rows that contain BLOBs stored in the repository do not influence the speed of the SELECT. Only when the data is accessed, for example using getBytes(), getString(), getBinaryStream(), etc. is the data actually retrieved from the repository.
Enable BLOB streaming by adding the following option to the JDBC URL: enableBlobStreaming=true. For example:
Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection conn = null;
try {
conn = DriverManager.getConnection("jdbc:mysql://localhost/test"+
"?user=root&enableBlobStreaming=true");
// Do something with the connection
}
catch (Exception e) {
// Handle the exception
e.printStackTrace();
}
finally {
if (conn != null)
conn.close();
}
When set to true the JDBC driver will upload values set with setBinaryStream(), setAsciiStream() and setBlob() to the PBMS BLOB repository, but only if a valid length is specified. For example, a value inserted using setBinaryStream() with a length of -1 will be not be uploaded to the repository. Instead the value will by inserted directly into the table, as if enableBlobStreaming were set to false.
Here is an example of how to insert a BLOB into a table.
public void insertBLOB(Connection conn, InputStream in, int len)
throws SQLException
{
PreparedStatement ps = null;
try {
ps = conn.prepareStatement("insert my_test_tab(n_text) values(?)");
ps.setBinaryStream(1, in, len);
int rows = ps.executeUpdate();
}
finally {
if (ps != null)
ps.close();
}
}
Note that currently only LONGBLOB type columns of PBXT tables may contain streamable BLOB data. This means that when enableBlobStreaming=true you should only use setBinaryStream(), setAsciiStream() and setBlob() with a valid length on columns of type LONGBLOB belonging to a PBXT table.
** WARNING ** The JDBD driver does not check that the database column or table is the correct type . If use setBinaryStream(), setAsciiStream() or setBlob() with a length greater or equal to one on columns that are not of type LONGBLOB, or table type other then PBXT, data will be lost!
To avoid this problem, specify a length of -1 when using setBinaryStream(), setAsciiStream() or setBlob() on columns that are not of type LONGBLOB, or when using a table type other than PBXT.
Selected values of type LONGBLOB are automatically checked for valid BLOB repository references. The reference is then used to automatically download the BLOB data from the BLOB repository. Using a getBinaryStream(), getAsciiStream() or getBlob() you have direct access to the BLOB data stream from the repository.
Here is an example of how to retrieve a BLOB from a table:
public static InputStream selectBLOB(Connection conn, int id) throws SQLException
{
Statement st = null;
InputStream in = null;
ResultSet rs;
try {
st = conn.createStatement();
rs = st.executeQuery(
"select n_text from pbms_test_tab where n_id = "+id);
rs.next();
in = rs.getBinaryStream("n_text");
}
finally {
if (st != null)
st.close();
}
return in;
}
If you required the length of the BLOB before you begin to retrieve the data, then use getBlob(), instead of rs.getBinaryStream().
The current implementation is based on JDK 1.5.x, which only allows for lengths in setBinaryStream(), setAsciiStream() and setBlob() of type 'int'. This means that the maximum BLOB size that can currently permitted is 2GB.
As mentioned above, streamable BLOBs can only be stored in LONGBLOB type columns in PBXT tables.
11.4. Building and Testing JDBC
This source code distribution of the PBMS storage engine includes the modified source code of the MySQL Connector/J JDBC driver, in a sub-directory called java. This directory also contains a ready build JDBC driver in the form of a .jar file.
For debugging purposes you can include the source code in your project or build the JDBC driver yourself. To do this you will need Apache Ant version 1.6 or newer (available from http://ant.apache.org/). You will also need the Java JDK-1.4.2 or later.
Enter the following command to compile and create the driver:
$ ant dist
After this you can test the BLOB streaming using the 'runtest.sh' shell script:
$ ./runtest.sh
The test connects to a local MySQL server and creates a PBXT table. The test assumes that you have already installed the PBXT and PBMS storage engines.
PBMS provides a set of UDFs that enable tables from non strreaming-enabled engines to be used with the PBMS engine. They should not be used with tables created by streaming-enabled engines. The UDFs can either be used directly in SQL statements sent to the MySQL server or indirectly through PBMS client API calls.
Before they can be used they must be declared. This only needs to be done once on any given MySQL server. You can do this by calling the client API call pbms_init_udfs() or executing the following MySQL statements:
CREATE FUNCTION pbms_insert_blob_trig RETURNS STRING SONAME "libpbms.so";
CREATE FUNCTION pbms_update_blob_trig RETURNS STRING SONAME "libpbms.so";
CREATE FUNCTION pbms_delete_blob_trig RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_delete_all_blobs_in_table RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_rename_table_with_blobs RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_dropping_database RETURNS INTEGER SONAME "libpbms.so";
CREATE FUNCTION pbms_enabled_engine RETURNS INTEGER SONAME "libpbms.so";
| Name |
Description |
| pbms_insert_blob_trig() | Notifies PBMS of data being inserted into a BLOB streaming-enabled column. |
| pbms_update_blob_trig() | Notifies PBMS of data in a BLOB streaming-enabled column being updated. |
| pbms_delete_blob_trig() | Notifies PBMS of data in a BLOB streaming-enabled column being deleted. |
| pbms_delete_all_blobs_in_table() | Notifies PBMS to remove all BLOB references for a table. |
| pbms_rename_table_with_blobs() | Notifies PBMS that a streaming-enabled table has been renamed. |
| pbms_dropping_database() | Notifies PBMS that a database containing streaming-enabled tables is about to be dropped. |
| pbms_enabled_engine() | Test is a specific engine is streaming-enabled. |
STRING pbms_insert_blob_trig(STRING database, STRING table, INTEGER col_position, STRING blob_url)
Description
Notifies PBMS of data being inserted into a BLOB streaming-enabled column.
Parameters
- The target database.
- The target table.
- The ordinal position of the BLOB reference column. The first column in the table has ordinal position 1. This column can be any column type capable of holding a string value of up to 200 characters long. By convention this column is normally a longblob type and the C API functions pbms_init_table() and pbms_reset_table_blobs() assume that BLOB reference columns are longblob type and all longblob columns are BLOB reference column.
- The BLOB reference being inserted.
Return value
- On success the BLOB reference to be inserted into the BLOB reference column is returned. This may be different from the BLOB reference being inserted.
- On failure NULL is returned and an error message is printed to the server's stderr output.
STRING pbms_update_blob_trig(STRING database, STRING table, INTEGER col_position, STRING old_blob_url, STRING new_blob_url)
Description
Notifies PBMS of data in a BLOB streaming-enabled column being updated update.
Parameters
- The target database.
- The target table.
- The ordinal position of the BLOB reference column. The first column in the table has ordinal position 1. This column can be any column type capable of holding a string value of up to 200 characters long. By convention this column is normally a longblob type and the C API functions pbms_init_table() and pbms_reset_table_blobs() assume that BLOB reference columns are longblob type and all longblob columns are BLOB reference column.
- The old BLOB reference being updated.
- The new BLOB reference.
Return value
- On success the new BLOB reference to be used in the update is returned. This may be different from the new BLOB reference passed into the function.
- On failure NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_delete_blob_trig(STRING database, STRING table, INTEGER col_position)
Description
Notifies PBMS of data in a BLOB streaming-enabled column being deleted.
Parameters
- The target database.
- The target table.
- The ordinal position of the BLOB reference column. The first column in the table has ordinal position 1. This column can be any column type capable of holding a string value of up to 200 characters long. By convention this column is normally a longblob type and the C API functions pbms_init_table() and pbms_reset_table_blobs() assume that BLOB reference columns are longblob type and all longblob columns are BLOB reference column.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_delete_all_blobs_in_table(STRING database, STRING table)
Description
Notifies PBMS to remove all BLOB references for a table. This function should be called after a streaming-enabled table has been dropped.
Parameters
- The target database.
- The target table.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_rename_table_with_blobs(STRING database, STRING old_table, STRING new_table)
Description
Notifies PBMS that a streaming-enabled table has been renamed This function should be called after a streaming-enabled table has been renamed.
Parameters
- The target database.
- The old target table name.
- The new target table name.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_dropping_database(STRING database)
Description
Notifies PBMS that a database containing streaming-enabled tables is about to be dropped.
Parameters
- The target database.
Return value
- On success 0 is returned.
- On failure 1 or NULL is returned and an error message is printed to the server's stderr output.
INTEGER pbms_enabled_engine(STRING engine)
Description
Test is a specific engine is streaming-enabled.
Parameters
- The name of the engine being tested.
Return value
- The value 1 is returned if the engine is streaming-enabled.
- The value 0 is returned if the engine is not streaming-enabled.
- On failure NULL is returned and an error message is printed to the server's stderr output.
The C API is provided through the pbmscl library which is installed in the same location as the MySQL libraries when PBMS is installed.
One of the features of the API is that the 'get' and 'put' functions used to transfer the BLOB data between the client and server do so by using callback functions that get called one or more times during the transfer. This enables the client to transfer large BLOBs with out having too buffer the entire BLOB during the transfer.
Example code
An example is included with the PBMS source code under the 'pbmstest' directory. The example demonstrates a couple of different ways in which the API can be used to handle BLOB data. The test application has 2 modes: interactive and 'stress test'.
The interactive mode demonstrates inserting and fetching of BLOB data in the form of text entered by the user or files. The insertion and fetching of files is a good example of how BLOB data can be transferred with out having to buffer the entire BLOB.
The 'stress test' mode uses multiple threads to perform operations on BLOBs in multiple databases and tables at the same time. As the name implies it is intended more as a stress test for the BLOB streaming engine as apposed to a demonstration of any particular API feature.
13.1 Function Descriptions| Name |
Description |
| pbms_library_init() | Initialize PBMS API. This should be called to initialize the PBMS library before you call any other PBMS library function. |
| pbms_library_end() | Terminate PBMS API use. This should be called to free all PBMS library resources when you are done using the PBMS library. |
| pbms_errno() | Returns the error code for the most recently invoked API function that can succeed or fail. |
| pbms_error() | Returns a null-terminated string containing the error message for the most recently invoked API function that can succeed or fail. |
| pbms_connect() | Establishes a connection to a PBMS BLOB streaming server via a host name and port number. |
| pbms_mysql_connect() | Establishes a connection to a PBMS BLOB streaming server via a MySQL connection handle. |
| pbms_close() | Closes the connection to the PBMS BLOB streaming server. |
| pbms_set_option() | Sets a connection option value for the PBMS connection handle. |
| pbms_get_option() | Gets a connection option value for the PBMS connection handle. |
| pbms_put_data() | Sends data to the PBMS BLOB streaming server and receives a BLOB reference after completion. |
| pbms_get_data() | Get data from the PBMS BLOB streaming server. |
| pbms_is_blob_reference() | Checks the format of the BLOB reference and returns true if it looks like a valid reference. |
| pbms_get_blob_size() | Get the size of the BLOB for the specified BLOB reference. |
| pbms_init_udfs() | Declares the PBMS UDFs on the server if they do not already exist. |
| pbms_enabled() | Test is a specific engine is streaming-enabled. |
| Functions used with tables created by non streaming-enabled engines | |
| pbms_init_table() | Converts any longblob columns in the table into streaming-enabled columns by creating PBMS notification triggers on them. |
| pbms_reset_table_blobs() | Recreates PBMS notification triggers on 'longblob' columns in the table. |
| pbms_drop_table_blobs() | Notifies PBMS to remove all BLOB references for a table. |
| pbms_table_renamed() | Notifies PBMS that a streaming-enabled table has been renamed. |
| pbms_dropping_database() | Notifies PBMS that a database containing streaming-enabled tables is about to be dropped. |
| Option |
Parameter type |
Default | Description |
| PBMS_OPTION_ CONNECTION_TIMEOUT |
unsigned int* | 10 | The maximum idle time (in seconds) a connection to the Blob Streaming engine will be held open for. |
| PBMS_OPTION_ HOST |
(const char *)* | localhost | The connection's Blob Streaming server host IP address. (Read Only) |
| PBMS_OPTION_ PORT |
unsigned int* | 8080 | The connection's Blob Streaming server host port number. (Read Only) |
| PBMS_OPTION_ TRANSMITION_TIMEOUT |
unsigned int* | 0 | If greater than zero this sets a limit to how long a blob transfer can take. |
pbms_bool pbms_library_init(pbms_bool thread_safe)
Description
Initialize PBMS API. This should be called to initialize the PBMS library before you call any other PBMS library function.
Parameters
- Set thread_safe to 'true' if the application requires the library to be thread safe.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
void pbms_library_end()
Description
Terminate PBMS API use. This should be called to free all PBMS library resources when you are done using the PBMS library.
int pbms_errno(PBMS pbms)
Description
For the connection specified by pbms, pbms_errno() returns the error code for the most recently invoked API function that can succeed or fail.
Parameters
- The pbms connection the error information is being requested for. If NULL then the last error for a non-connection specific API function is returned.
Return value
- A return code of zero indicates the most recent API function did not report an error.
- The PBMS error code for the most recently invoked API function that can succeed or fail.
const char *pbms_error(PBMS pbms)
Description
For the connection specified by pbms, pbms_error() returns a null-terminated string containing the error message for the most recently invoked API function that can succeed or fail.
Parameters
- The pbms connection the error information is being requested for. If NULL then the last error for a non-connection specific API function is returned.
- A PBMS error string for the most recently invoked API function that can succeed or fail. An empty string means that no error occurred.
PBMS pbms_connect(const char* host, unsigned int port)
Description
Establishes a connection to a PBMS BLOB streaming server running on host. pbms_connect() must complete successfully before you can execute any other API functions that require a valid PBMS connection handle structure.
Parameters
- The hostname or an IP address. If host is NULL or an empty string, a connection to the local host is assumed.
- The TCP/IP port number of the PBMS BLOB streaming server.
- On success A PBMS connection handle. is returned.
- On failure NULL is returned. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
PBMS pbms_mysql_connect(struct st_mysql *mysql)
Description
Establishes a connection to a PBMS BLOB streaming server associated with the MYSQL connection handle.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- On success A PBMS connection handle. is returned.
- On failure NULL is returned. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
void pbms_close(PBMS pbms)
Description
Closes the connection to the PBMS BLOB streaming server associated with the PBMS connection handle and frees all of it's resources.
Parameters
- A valid PBMS connection handle.
pbms_bool pbms_set_option(PBMS pbms, enum pbms_option option, const void *in_value)
Description
Sets a connection option value for the PBMS connection handle.
Parameters
- A valid PBMS connection handle.
- The connection option to be get.
- A pointer to the option dependent value.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_get_option(PBMS pbms, enum pbms_option option, const void *out_value)
Description
Gets a connection option value for the PBMS connection handle.
Parameters
- A valid PBMS connection handle.
- The connection option to be get.
- A pointer to the option dependent buffer into which the requested value will be placed.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
typedef char *(* PBMS_CALLBACK_FUNC) (void *caller_data, char *buffer, size_t *buffer_size, size_t size);
Description
The user defined callback function is called when transferring data. The callback will continue to be called until an error occurs or the callback returns NULL.
Parameters
- The 'caller_data' parameter passed into the pbms_put_data() or pbms_get_data() function which is a pointer to a user defined data structure.
- The first time he callback is called this will be the buffer pointed to by the 'buffer' parameter in the pbms_put_data() or pbms_get_data() function after that it will be the value returned by the previous call to the callback.
- On input 'buffer_size' contains the actual number of bytes transferred by the last call. On output this parameter is set to the number of bytes of data the user is prepared to transfer. The size of the buffer returned by the callback must be at least this size.
If. the connection option PBMS_OPTION_TRANSMITION_TIMEOUT is set to zero the value passed in will be unaltered from the last time the callback was called, or if this is the first call to the callback, the buffer_size value passed into pbms_put_data() or pbms_get_data().
If the connection option PBMS_OPTION_TRANSMITION_TIMEOUT is not zero and the transfer operation timed out then the value passed in may be less than the maximum bytes that the caller had been prepared to transfer. - The 'size' parameter contains the number of bytes remaining to be transferred.
Return value
- When more data is to be transferred the callback returns a pointer to the next location, into which or from which, the data is to be transferred. The size of this buffer must be at least as large as the value returned in the 'size' parameter.
- NULL is returned when the user is not prepared to send or receive anymore data. If NULL is returned by the callback before the remaining number of bytes to be transferred reaches zero the transfer operation will report an "Unexpected end of data." error.
Example:
See the examples provided for pbms_put_data() and pbms_get_data().
pbms_bool pbms_put_data(MPBMS pbms, size_t size, const char *database, const char *table, PBMS_BLOB_REF *ref, char *buffer, size_t data_size, PBMS_CALLBACK_FUNC cb, void *caller_data)
Description
Sends data to the PBMS BLOB streaming server and receives a BLOB reference after completion.
Parameters
- A valid PBMS connection handle.
- The total size, in bytes, of the data to be sent.
- The name of the database into which the BLOB is to be inserted.
- The name of the table into which the BLOB is to be inserted. This parameter is optional but should be supplied if known to make BLOB insertion more efficient.
- A pointer to a BLOB reference buffer into which the BLOB reference will be place on successful completion of the function call.
- A pointer to the initial block of data. This can be NULL if a callback is supplied.
- The number of bytes of data in 'buffer'. If 'data_size' == 'size' then a callback is not required.
- The callback function through which the data is sent to the server. If a callback is supplied it will be called even if all the data was supplied in the buffer.
- A pointer to user defined data passed into the callback function.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
Example:
#define BUFFER_SIZE 100
char *my_put_file_callback(void *caller_data, char *buffer, size_t *buffer_size, size_t size)
{
int *fh = (int*) caller_data;
void *data = NULL;
if (size) {
data = buffer;
*buffer_size = read(fh, data, BUFFER_SIZE);
}
return data;
}
main ()
{
PBMS pbms;
PBMS_BLOB_REF blob_ref;
char buffer[BUFFER_SIZE];
pbms_library_init(false);
pbms = pbms_connect("localhost", 8080);
char *data = "A test string.";
size_t size = strlen(info.data) +1;
// Send a blob directly with out using the callback:
pbms_put_data(pbms, size, "test", NULL, &blob_ref, data, size, NULL, NULL);
printf("Blob reference = \"%s\"\n", (char *) &blob_ref);
// Send a file using a callback:
int fh = open("myblob.jpg", O_RDONLY, 0000777);
data = NULL;
size = lseek(fh, 0, SEEK_END);
lseek(fh, 0, SEEK_SET);
pbms_put_data(pbms, size, "test", NULL, &blob_ref, buffer, 0, my_put_file_callback, &fh);
printf("Blob reference = \"%s\"\n", (char *) &blob_ref);
close(fh);
pbms_close(pbms);
pbms_library_end();
}
pbms_bool pbms_get_data(PBMS pbms, PBMS_BLOB_REF *ref, char *buffer, size_t buffer_size, PBMS_CALLBACK_FUNC cb, void *caller_data)
Description
Get data from the PBMS BLOB streaming server for the specified BLOB reference.
Parameters
- A valid PBMS connection handle.
- A pointer to a BLOB reference for the data to be received.
- A pointer to the initial initial buffer for the data. This can be NULL if a callback is supplied.
- The size of the buffer. If this is equal to the size of the blob then a callback is not required.
- The callback function through which the data is received from the server. If a callback is supplied it will be called even if the buffer supplied was big enough for all the data.
- A pointer to user defined data passed into the callback function.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
Example:
#define BUFFER_SIZE 100
char *blob_to_file_callback(void *caller_data, char *buffer, size_t *buffer_size, size_t size)
{
i nt *fh = (int*) caller_data;
// Write out the last block of data received.
write(*fh, buffer, *buffer_size);
if (!size)
return NULL; // No more date.
buffer_size = BUFFER_SIZE;
return buffer;
}
char *my_get_callback(void *caller_data, char *buffer, size_t *buffer_size, size_t size)
{
MY_INFO *info = (MY_INFO*) caller_data;
if (!size)
return NULL;
info->size = 0;
return info->data;
}
main ()
{
PBMS pbms;
PBMS_BLOB_REF blob_ref;
char buffer[BUFFER_SIZE];
int fh;
pbms_library_init(false);
pbms = pbms_connect("localhost", 8080);
// Get a blob and write it to a file
get_a_blob_ref_from_some_where(&blob_ref);
fh = open("myblob.jpg", O_WRONLY | O_CREAT | O_TRUNC, 0000777);
pbms_get_data(pbms, &blob_ref, buffer, BUFFER_SIZE, blob_to_file_callback, &fh);
printf("Blob size = %d"\n", lseek(fh, 0, SEEK_END));
close(fh);
pbms_close(pbms);
pbms_library_end();
}
pbms_bool pbms_is_blob_reference(PBMS pbms, PBMS_BLOB_REF *ref)
Description
Checks the format of the BLOB reference and returns true if it looks like a valid reference. This doesn't guarantee that the reference references a valid BLOB on the PBMS BLOB streaming server only that the format of the BLOB reference is correct.
Parameters
- A valid PBMS connection handle.
- A pointer to the potential BLOB reference.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details.
size_t pbms_get_blob_size(PBMS pbms, PBMS_BLOB_REF *ref)
Description
Get the size of the BLOB for the specified BLOB reference.
Parameters
- A valid PBMS connection handle.
- A pointer to a BLOB reference.
- The size of the BLOB referenced in bytes.
- If an error occurs zero is returned. Use pbms_errno() and pbms_error() to retrieve error details.
pbms_bool pbms_init_udfs(struct st_mysql *mysql)
Description
Declares the PBMS UDFs on the server if they do not already exist.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_enabled(struct st_mysql *mysql, const char *engine, pbms_bool *enabled)
Description
Tests if an engine engine is streaming-enabled.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- The name of an engine, "PBXT" or "MYISAM" for example.
- A pointer to a bool flag that will be set to 'true' or 'false' depending if the specified engine is streaming-enabled.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_init_table(struct st_mysql *mysql, const char *database, const char *table, pbms_bool no_blobs_ok)
Description
Converts any longblob columns in the table into streaming-enabled columns by creating PBMS notification triggers on them.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- The name of the target database.
- The name of the target table.
- A flag indication if the failure to find any longblob columns in the table should be treated as an error.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_reset_table_blobs(struct st_mysql *mysql, const char *database, const char *table)
Description
Similar to .pbms_init_table() except that it is meant to be called if an alter table has been done that may have dropped or added longblob columns or changed the ordinals of longblob columns.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- The name of the target database.
- The name of the target table.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_drop_table_blobs(struct st_mysql *mysql, const char *database, const char *table)
Description
Notifies PBMS to remove all BLOB references for a table. This should be called after a streaming-enabled table has been dropped.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- The name of the target database.
- The name of the target table.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_table_renamed(struct st_mysql *mysql, const char *database, const char *old_table, const char *new_table )
Description
Notifies PBMS that a streaming-enabled table has been renamed.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- The name of the target database.
- The old name of the target table.
- The new name of the target table.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
pbms_bool pbms_dropping_database(struct st_mysql *mysql, const char *database)
Description
Notifies PBMS that a database containing streaming-enabled tables is about to be dropped.
Parameters
- A valid MYSQL connection handle connected to a MySQL server with a loaded PBMS engine.
- The name of the target database.
Return value
- Returns true on success.
- Returns false on failure. Use pbms_errno() and pbms_error() to retrieve error details passing NULL for 'pbms'.
Copyright 2008 by PrimeBase Technologies GmbH