The program in Example 8-22 implements the HTTP Range feature, which allows clients to request one or more sections of a file.
This is most frequently used to download the remaining portion of a file whose transfer was interrupted, for example, fetching only the remaining part of a movie that the viewer stopped watching.
Normally, your web server can handle this for you. It will parse the header, load in the selected portions of the file, and serve them back to the browser (along with the necessary HTTP headers).
However, if you sell multimedia, such as podcasts or music, you don’t want to expose those files directly. Otherwise, anyone who got the URL could download the files.
Instead, you want to make sure only people who purchased the file are able to read it. For that, the web server alone isn’t enough; you need PHP.
But that recipe only works for sending an entire file. This program expands upon that simpler example to enable sending only the sections of the file requested by the web browser.
At first glance, this doesn’t sound difficult. However, the HTTP 1.1 specification has a number of features that layer on complexity, such as multiple ranges (with a different syntax for these replies), offsets from the end of the file (e.g., “only the last 1000 bytes”), and specific status codes and headers for invalid requests.
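To make those syntax variations concrete, here is a minimal sketch that checks a few Range header values against a pattern along the lines of the one used in Example 8-22. The helper name looks_like_byte_ranges() and the sample values are assumptions made for illustration. The check covers syntax only; it says nothing about whether the requested bytes exist.

```php
<?php
// Hypothetical helper (not part of Example 8-22): does the header value
// follow the bytes=first-last[,first-last...] syntax?
function looks_like_byte_ranges($header)
{
    // Anchored at both ends; the trailing "i" makes "bytes" case insensitive
    return preg_match('/^bytes=\d*-\d*(,\d*-\d*)*$/i', $header) === 1;
}

$samples = array(
    'bytes=0-4',     // the first five bytes
    'bytes=5-',      // from byte 5 to the end of the file
    'bytes=-1000',   // suffix form: only the last 1000 bytes
    'bytes=0-4,-5',  // multiple ranges in one request
    'lines=0-4',     // a unit other than bytes
);

foreach ($samples as $sample) {
    $verdict = looks_like_byte_ranges($sample) ? 'ok' : 'rejected';
    printf("%-14s %s\n", $sample, $verdict);
}
```

Only the last sample is rejected here, because the sketch accepts nothing but the bytes unit.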
Example 8-22. HTTP Range
    <?php
    // Add your authentication here, optionally.

    // The file
    $file = __DIR__ . '/numbers.txt';
    $content_type = 'text/plain';

    // Check that it's readable and get the file size
    if (($filelength = filesize($file)) === false) {
        error_log("Problem reading filesize of $file.");
        http_response_code(500); // Nothing sensible can be sent without a size
        exit;
    }

    // Parse header to determine info needed to send response
    if (isset($_SERVER['HTTP_RANGE'])) {

        // Delimiters are case insensitive
        // The pattern is anchored at both ends so stray leading text is rejected
        if (!preg_match('/^bytes=\d*-\d*(,\d*-\d*)*$/i', $_SERVER['HTTP_RANGE'])) {
            error_log("Client requested invalid Range.");
            send_error($filelength);
            exit;
        }

        /*
        Spec: "When a client requests multiple byte-ranges in one request, the
        server SHOULD return them in the order that they appeared in the
        request."
        */
        $ranges = explode(',',
            substr($_SERVER['HTTP_RANGE'], 6)); // everything after bytes=
        $offsets = array();
        // Extract and validate each offset
        // Only keep the ones that pass
        foreach ($ranges as $range) {
            $offset = parse_offset($range, $filelength);
            if ($offset !== false) {
                $offsets[] = $offset;
            }
        }

        /*
        Depending on the number of valid ranges requested, you must return
        the response in a different format
        */
        switch (count($offsets)) {
        case 0:
            // No valid ranges
            error_log("Client requested no valid ranges.");
            send_error($filelength);
            exit;
            break;
        case 1:
            // One valid range, send standard reply
            http_response_code(206); // Partial Content
            list($start, $end) = $offsets[0];
            header("Content-Range: bytes $start-$end/$filelength");
            header("Content-Type: $content_type");

            // Set variables to allow code reuse across this case and the next one
            // Note: 0-0 is 1 byte long, because we're inclusive
            $content_length = $end - $start + 1;
            $boundaries = array(0 => '', 1 => '');
            break;
        default:
            // Multiple valid ranges, send multipart reply
            http_response_code(206); // Partial Content
            $boundary = str_rand(32); // String to separate each part

            /*
            Need to compute Content-Length of entire response,
            but loading the entire response into a string could use a lot of
            memory, so calculate value using the offsets.
            Take this opportunity to also calculate the boundaries.
            */
            $boundaries = array();
            $content_length = 0;

            foreach ($offsets as $offset) {
                list($start, $end) = $offset;

                // Used to split each section
                $boundary_header =
                    "\r\n" .
                    "--$boundary\r\n" .
                    "Content-Type: $content_type\r\n" .
                    "Content-Range: bytes $start-$end/$filelength\r\n" .
                    "\r\n";

                $content_length += strlen($boundary_header) + ($end - $start + 1);
                $boundaries[] = $boundary_header;
            }

            // Add the closing boundary
            $boundary_header = "\r\n--$boundary--";
            $content_length += strlen($boundary_header);
            $boundaries[] = $boundary_header;

            // Chop off extra \r\n in first boundary
            $boundaries[0] = substr($boundaries[0], 2);
            $content_length -= 2;

            // Change to the special multipart Content-Type
            $content_type = "multipart/byteranges; boundary=$boundary";
        }
    } else {
        // Send the entire file
        // Set variables as if this was extracted from Range header
        $start = 0;
        $end = $filelength - 1;
        $offset = array($start, $end);

        $offsets = array($offset);
        $content_length = $filelength;
        $boundaries = array(0 => '', 1 => '');
    }

    // Tell us what we're getting
    header("Content-Type: $content_type");
    header("Content-Length: $content_length");

    // Give it to us
    // Open in binary mode so the bytes are sent exactly as stored
    $handle = fopen($file, 'rb');
    if ($handle) {
        $offsets_count = count($offsets);
        // Print each boundary delimiter and the appropriate part of the file
        for ($i = 0; $i < $offsets_count; $i++) {
            print $boundaries[$i];
            list($start, $end) = $offsets[$i];
            send_range($handle, $start, $end);
        }
        // Closing boundary
        print $boundaries[$i];

        fclose($handle);
    }

    // Move to the proper place in the file
    // And print out the requested piece in chunks
    function send_range($handle, $start, $end) {
        $line_length = 4096; // magic number

        if (fseek($handle, $start) === -1) {
            error_log("Error: fseek() fail.");
        }

        $left_to_read = $end - $start + 1;
        do {
            $length = min($line_length, $left_to_read);
            if (($buffer = fread($handle, $length)) !== false) {
                print $buffer;
            } else {
                error_log("Error: fread() fail.");
            }
        } while ($left_to_read -= $length);
    }

    // Send the failure header
    function send_error($filelength) {
        http_response_code(416);
        header("Content-Range: bytes */$filelength"); // Required in 416.
    }

    // Convert an offset to the start and end locations in the file
    // Or return false if it's invalid
    function parse_offset($range, $filelength) {
        /*
        Spec: "The first-byte-pos value in a byte-range-spec gives the
        byte-offset of the first byte in a range."
        Spec: "The last-byte-pos value gives the byte-offset of the last byte in
        the range; that is, the byte positions specified are inclusive."
        */
        list($start, $end) = explode('-', $range);

        /*
        Spec: "A suffix-byte-range-spec is used to specify the suffix of the
        entity-body, of a length given by the suffix-length value."
        */
        if ($start === '') {
            // explode() returns strings, so compare the numeric value with 0
            if ($end === '' || (int) $end === 0) {
                // Asked for range of "-" or "-0"
                return false;
            } else {
                /*
                Spec: "If the entity is shorter than the specified suffix-length,
                the entire entity-body is used."
                Spec: "Byte offsets start at zero."
                */
                $start = max(0, $filelength - $end);
                $end = $filelength - 1;
            }
        } else {
            /*
            Spec: "If the last-byte-pos value is absent, or if the value is
            greater than or equal to the current length of the entity-body,
            last-byte-pos is taken to be equal to one less than the current
            length of the entity body in bytes."
            */
            if ($end === '' || $end > $filelength - 1) {
                $end = $filelength - 1;
            }

            /*
            Spec: "If the last-byte-pos value is present, it MUST be greater
            than or equal to the first-byte-pos in that byte-range-spec, or the
            byte-range-spec is syntactically invalid."
            This also catches cases where start > filelength
            */
            if ($start > $end) {
                return false;
            }
        }

        // Return integers so values such as "007" print as 7 in the headers
        return array((int) $start, (int) $end);
    }

    // Generate a random string to delimit sections within the response
    function str_rand($length = 32,
        $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ') {
        if (!is_int($length) || $length < 0) {
            return false;
        }

        $characters_length = strlen($characters) - 1;
        $string = '';

        for ($i = $length; $i > 0; $i--) {
            $string .= $characters[mt_rand(0, $characters_length)];
        }

        return $string;
    }
For simplicity, the demonstration file, numbers.txt, looks like:
    0123456789
Here’s how it behaves when you make requests with the command-line curl program against PHP’s built-in web server. The -v flag dumps a verbose version of the HTTP exchange.
The entire file, without any Range header:
    $ curl -v http://localhost:8000/range.php
    * About to connect() to localhost port 8000 (#0)
    * Trying ::1...
    * connected
    * Connected to localhost (::1) port 8000 (#0)
    > GET /range.php HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: localhost:8000
    > Accept: */*
    >
    [Sun Aug 18 14:33:36 2013] ::1:59812 [200]: /range.php
    < HTTP/1.1 200 OK
    < Host: localhost:8000
    < Connection: close
    < X-Powered-By: PHP/5.4.9
    < Content-Type: text/plain
    < Content-Length: 10
    <
    * Closing connection #0
    0123456789
Only the first 5 bytes:
    $ curl -v -H 'Range: bytes=0-4' http://localhost:8000/range.php
    * About to connect() to localhost port 8000 (#0)
    * Trying ::1...
    * connected
    * Connected to localhost (::1) port 8000 (#0)
    > GET /range.php HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: localhost:8000
    > Accept: */*
    > Range: bytes=0-4
    >
    [Sun Aug 18 14:30:52 2013] ::1:59798 [206]: /range.php
    < HTTP/1.1 206 Partial Content
    < Host: localhost:8000
    < Connection: close
    < X-Powered-By: PHP/5.4.9
    < Content-Range: bytes 0-4/10
    < Content-Type: text/plain
    < Content-Length: 5
    <
    * Closing connection #0
    01234
See how the status code is now 206 instead of 200, and there is a Content-Range HTTP header telling you what bytes were returned.
Or the last 5 bytes:
    $ curl -v -H 'Range: bytes=-5' http://localhost:8000/range.php
    * About to connect() to localhost port 8000 (#0)
    * Trying ::1...
    * connected
    * Connected to localhost (::1) port 8000 (#0)
    > GET /range.php HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: localhost:8000
    > Accept: */*
    > Range: bytes=-5
    >
    [Sun Aug 18 14:30:33 2013] ::1:59796 [206]: /range.php
    < HTTP/1.1 206 Partial Content
    < Host: localhost:8000
    < Connection: close
    < X-Powered-By: PHP/5.4.9
    < Content-Range: bytes 5-9/10
    < Content-Type: text/plain
    < Content-Length: 5
    <
    * Closing connection #0
    56789
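The suffix form counts backward from the end of the file, so the server has to turn a length into a pair of inclusive byte positions. The following is a minimal sketch of that arithmetic for the request above; suffix_to_offsets() is a hypothetical helper written for this illustration, and it assumes the suffix length has already been checked to be greater than zero.

```php
<?php
// Hypothetical helper: convert a suffix length ("the last N bytes")
// into inclusive start and end positions within a file.
function suffix_to_offsets($suffix_length, $filelength)
{
    // A suffix longer than the file simply means the whole file
    $start = max(0, $filelength - $suffix_length);
    $end = $filelength - 1; // byte offsets start at zero
    return array($start, $end);
}

list($start, $end) = suffix_to_offsets(5, 10);
echo "Content-Range: bytes $start-$end/10\n"; // Content-Range: bytes 5-9/10
echo "Content-Length: ", $end - $start + 1, "\n"; // Content-Length: 5
```

Asking for the last 1000 bytes of the same 10-byte file would yield positions 0 through 9, which is the entire file.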
The first 5 and the last 5 bytes:
    $ curl -v -H 'Range: bytes=0-4,-5' http://localhost:8000/range.php
    * About to connect() to localhost port 8000 (#0)
    * Trying ::1...
    * connected
    * Connected to localhost (::1) port 8000 (#0)
    > GET /range.php HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: localhost:8000
    > Accept: */*
    > Range: bytes=0-4,-5
    >
    [Sun Aug 18 14:30:12 2013] ::1:59794 [206]: /range.php
    < HTTP/1.1 206 Partial Content
    < Host: localhost:8000
    < Connection: close
    < X-Powered-By: PHP/5.4.9
    < Content-Type: multipart/byteranges; boundary=ALLIeNOkvwgKk0ib91ZNph5qi8fHo2ai
    < Content-Length: 236
    <
    --ALLIeNOkvwgKk0ib91ZNph5qi8fHo2ai
    Content-Type: text/plain
    Content-Range: bytes 0-4/10

    01234
    --ALLIeNOkvwgKk0ib91ZNph5qi8fHo2ai
    Content-Type: text/plain
    Content-Range: bytes 5-9/10

    56789
    * Closing connection #0
    --ALLIeNOkvwgKk0ib91ZNph5qi8fHo2ai--
The Content-Type is switched from text/plain to multipart/byteranges; boundary=ALLIeNOkvwgKk0ib91ZNph5qi8fHo2ai. The “real” Content-Type and Content-Range headers have moved into each section.
Because these two ranges together cover the entire file, it would also be valid to serve the file as if the request had no Range header.
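The Content-Length of 236 in the multipart response can be reproduced without building the body in memory, by adding up the length of each part header, each range, and the closing delimiter. This is a minimal sketch of that calculation; multipart_length() is a hypothetical helper written for this illustration, and the boundary string is copied from the output shown.

```php
<?php
// Hypothetical helper: total size of a multipart/byteranges body,
// computed from the offsets alone.
function multipart_length($offsets, $boundary, $content_type, $filelength)
{
    $total = 0;
    foreach ($offsets as $i => $offset) {
        list($start, $end) = $offset;
        $header = "--$boundary\r\n"
                . "Content-Type: $content_type\r\n"
                . "Content-Range: bytes $start-$end/$filelength\r\n"
                . "\r\n";
        if ($i > 0) {
            // Every part after the first is preceded by a CRLF
            $header = "\r\n" . $header;
        }
        // Ranges are inclusive, so 0-4 is five bytes long
        $total += strlen($header) + ($end - $start + 1);
    }
    // Closing delimiter: CRLF, two dashes, the boundary, two dashes
    return $total + strlen("\r\n--$boundary--");
}

$boundary = 'ALLIeNOkvwgKk0ib91ZNph5qi8fHo2ai';
$offsets = array(array(0, 4), array(5, 9));
echo multipart_length($offsets, $boundary, 'text/plain', 10), "\n"; // 236
```

The first part contributes 98 bytes, the second 100, and the closing delimiter 38.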
An invalid request, because bytes 20–24 do not exist:
    $ curl -v -H 'Range: bytes=20-24' http://localhost:8000/range.php
    * About to connect() to localhost port 8000 (#0)
    * Trying ::1...
    * connected
    * Connected to localhost (::1) port 8000 (#0)
    > GET /range.php HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: localhost:8000
    > Accept: */*
    > Range: bytes=20-24
    >
    [Sun Aug 18 14:32:17 2013] Client requested no valid ranges.
    [Sun Aug 18 14:32:17 2013] ::1:59806 [416]: /range.php
    < HTTP/1.1 416 Requested Range Not Satisfiable
    < Host: localhost:8000
    < Connection: close
    < X-Powered-By: PHP/5.4.9
    < Content-Range: bytes */10
    < Content-type: text/html
    <
    * Closing connection #0
This returns a third status code, 416, along with a header that tells the client the total length of the file, and therefore which byte positions are legal to request: Content-Range: bytes */10.
Finally, a request with one legal and one illegal range:
    $ curl -v -H 'Range: bytes=0-4,20-24' http://localhost:8000/range.php
    * About to connect() to localhost port 8000 (#0)
    * Trying ::1...
    * connected
    * Connected to localhost (::1) port 8000 (#0)
    > GET /range.php HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: localhost:8000
    > Accept: */*
    > Range: bytes=0-4,20-24
    >
    [Sun Aug 18 14:31:27 2013] ::1:59801 [206]: /range.php
    < HTTP/1.1 206 Partial Content
    < Host: localhost:8000
    < Connection: close
    < X-Powered-By: PHP/5.4.9
    < Content-Range: bytes 0-4/10
    < Content-Type: text/plain
    < Content-Length: 5
    <
    * Closing connection #0
    01234
Because there’s at least one valid range, the illegal ones are ignored and the response is the same as only asking for the first 5 bytes.
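The three outcomes seen in these requests (416, a single-part 206, and a multipart 206) all follow from how many ranges survive validation. The following is a minimal sketch of that decision; choose_reply() is a hypothetical helper written for this illustration. It assumes the ranges have already been split into explicit first and last positions, and it leaves out the suffix and open-ended forms.

```php
<?php
// Hypothetical helper: drop ranges that cannot be satisfied, then pick the
// shape of the reply from the number of ranges that remain.
function choose_reply($pairs, $filelength)
{
    $valid = array();
    foreach ($pairs as $pair) {
        list($first, $last) = $pair;
        if ($first <= $last && $first < $filelength) {
            // Clamp the end to the last byte that actually exists
            $valid[] = array($first, min($last, $filelength - 1));
        }
    }

    switch (count($valid)) {
        case 0:
            return '416 Requested Range Not Satisfiable';
        case 1:
            return '206 Partial Content (single part)';
        default:
            return '206 Partial Content (multipart/byteranges)';
    }
}

// bytes=0-4,20-24 against a 10-byte file: only 0-4 survives
echo choose_reply(array(array(0, 4), array(20, 24)), 10), "\n";
// bytes=20-24 alone: nothing survives
echo choose_reply(array(array(20, 24)), 10), "\n";
```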