Add protocol.h and protocol.c containing all about libcurl's
known URI schemes and their protocol handlers (so they exist).
Moves the scheme definitions from the various sources files into
protocol.c. Schemes are known and used, even of the protocol
handler is not build or just not implemented at all.
Closes#20906
The protocol handler method `connection_check` allowed to variable
operations to trigger with variable result bits. Only the `CONNCHECK_ISDEAD`
and `CONNRESULT_DEAD` were in use. Transform the function into
`connection_is_dead` without extra parameter and a bool result.
- Remove defines for `CONNCHECK_*` and `CONNRESULT_*`
- Rename protocol function in handler comments
- Change RTSP implementation (only protocol that uses this)
Closes#20890
And a few variables around.
There remain cases where the accepted pointer is const, yet the returned
pointer is written to.
Partly addressing (glibc 2.43):
```
* For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr,
strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return
pointers into their input arrays now have definitions as macros that
return a pointer to a const-qualified type when the input argument is
a pointer to a const-qualified type.
```
Ref: https://lists.gnu.org/archive/html/info-gnu/2026-01/msg00005.html
Reported-by: Rudi Heitbaum
Ref: #20420Closes#20421
This allows builds know about all schemes - but only have the protocol
implementations for those actually built-in.
It further allows multiple protocols to reuse the same protocol setup
and functions for both TLS and non-TLS implementations instead of
needing two (or more) structs.
The scheme information is now in 'struct Curl_scheme' and all the
function pointers for each scheme/protocol implementation are in struct
Curl_protocol.
The URL API now always work with all known protocols.
Closes#20351
For protocols: dict, file, gopher, tftp, http, mqtt, smb.
Move protocol hander table to the end of sources, rearrange static
functions is reverse dependency order as necessary.
Closes#20274
- asyn-thrdd.c: scope an include.
- apply more clang-format suggestions.
- tidy-up PP guard comments.
- delete empty line from the top of headers.
- add empty line after `curl_setup.h` include where missing.
- fix indent.
- CODE_STYLE.md: add `strcpy`.
Follow-up to 8636ad55df#20088
- lib1901.c: drop unnecessary line.
Follow-up to 436e67f65b#20076Closes#20070
- replace `sendf.h` with `curl_trc.h` where it was included just for it.
- drop unused `curl_trc.h` includes.
- easy: delete obsolete comment about `send.h` include reason.
Also:
- move out `curl_trc.h` include from `sendf.h` and include it directly
in users, where not done already. To flatten the include tree and
to less rely on indirect includes.
- stop including `sendf.h` from other headers, replace it with forward
declaration of `Curl_easy`, as done already elsewhere.
Verified with an all non-unity CI run.
Closes#20061
- curl_range: replace `sendf.h` with direct header dependency
`curl_trc.h`.
- drop `curl/curl.h` includes from internal sourcees in favor of the
include made from `curl_setup.h`. Replace it with the latter where
it's the only include.
- include `curl_setup.h` before using macros, where missing.
- drop redundant `stdlib.h`, `string.h` includes, in favor of
`curl_setup_once.h` including them.
- drop redundant `limits.h` in favor of `curl_setup.h` including it.
- fake_addrinfo.h: fix typo in comment.
- curl_setup_once.h: drop `stdio.h` in favor of earlier include in
`curl_setup.h`.
- drop stray, unused, `stddef.h` includes.
- memdebug.h: add missing `stddef.h` include. (relying on accidental
includes via other headers before this patch.)
- stddef.h: document why it's included.
- strerr: drop `curl/mprintf.h` in favor of `curl/curl.h` including it
via `curl_setup.h`.
Closes#20027
Before this patch curl used the C preprocessor to override standard
memory allocation symbols: malloc, calloc, strdup, realloc, free.
The goal of these is to replace them with curl's debug wrappers in
`CURLDEBUG` builds, another was to replace them with the wrappers
calling user-defined allocators in libcurl. This solution needed a bunch
of workarounds to avoid breaking external headers: it relied on include
order to do the overriding last. For "unity" builds it needed to reset
overrides before external includes. Also in test apps, which are always
built as single source files. It also needed the `(symbol)` trick
to avoid overrides in some places. This would still not fix cases where
the standard symbols were macros. It was also fragile and difficult
to figure out which was the actual function behind an alloc or free call
in a specific piece of code. This in turn caused bugs where the wrong
allocator was accidentally called.
To avoid these problems, this patch replaces this solution with
`curlx_`-prefixed allocator macros, and mapping them _once_ to either
the libcurl wrappers, the debug wrappers or the standard ones, matching
the rest of the code in libtests.
This concludes the long journey to avoid redefining standard functions
in the curl codebase.
Note: I did not update `packages/OS400/*.c` sources. They did not
`#include` `curl_setup.h`, `curl_memory.h` or `memdebug.h`, meaning
the overrides were never applied to them. This may or may not have been
correct. For now I suppressed the direct use of standard allocators
via a local `.checksrc`. Probably they (except for `curlcl.c`) should be
updated to include `curl_setup.h` and use the `curlx_` macros.
This patch changes mappings in two places:
- `lib/curl_threads.c` in libtests: Before this patch it mapped to
libcurl allocators. After, it maps to standard allocators, like
the rest of libtests code.
- `units`: before this patch it mapped to standard allocators. After, it
maps to libcurl allocators.
Also:
- drop all position-dependent `curl_memory.h` and `memdebug.h` includes,
and delete the now unnecessary headers.
- rename `Curl_tcsdup` macro to `curlx_tcsdup` and define like the other
allocators.
- map `curlx_strdup()` to `_strdup()` on Windows (was: `strdup()`).
To fix warnings silenced via `_CRT_NONSTDC_NO_DEPRECATE`.
- multibyte: map `curlx_convert_*()` to `_strdup()` on Windows
(was: `strdup()`).
- src: do not reuse the `strdup` name for the local replacement.
- lib509: call `_strdup()` on Windows (was: `strdup()`).
- test1132: delete test obsoleted by this patch.
- CHECKSRC.md: update text for `SNPRINTF`.
- checksrc: ban standard allocator symbols.
Follow-up to b12da22db1#18866
Follow-up to db98daab05#18844
Follow-up to 4deea9396b#18814
Follow-up to 9678ff5b1b#18776
Follow-up to 10bac43b87#18774
Follow-up to 20142f5d06#18634
Follow-up to bf7375ecc5#18503
Follow-up to 9863599d69#18502
Follow-up to 3bb5e58c10#17827Closes#19626
Add new functions in `curlx/warnless.h` for controlled type
conversions:
* curlx_uitouz, convert unsigned into to size_t (should always work)
* curlx_uztoso, convert size_t to curl_off_t, capping at CURL_OFF_T_MAX
* curlx_sztouz, convert ssize_t to size_t, return TRUE when ok
* curlx_sotouz_range, convert curl_off_t to size_t interval, capping
values to interval bounds
Remove some unnecesary casts, convert some internal recv functions
to the "return result, have size_t* arg" pattern.
Closes#19495
After this patch, the codebase no longer overrides system printf
functions. Instead it explicitly calls either the curl printf functions
`curl_m*printf()` or the system ones using their original names.
Also:
- drop unused `curl_printf.h` includes.
- checksrc: ban system printf functions, allow where necessary.
Follow-up to db98daab05#18844
Follow-up to 4deea9396b#18814Closes#18866
Deduce that the transfer response expects headers by the protocol
handler implementing `write_resp_hd` callback. This eleminates the
`getheader` parameter in the `Curl_xfer_setup_*()` methods.
Add an implementation to RTSP for `write_resp_hd`, joining the HTTP
protocol in the only handlers having it.
Reverse the default of request's `header` bit that signals that headers
are expected. Default is now FALSE, set to TRUE when setting up the
transfer by presence of `write_resp_hd` in the protocol handler.
Closes#18218
Make variants for transfers that send/receive or do both with just the
parameters they need. Split out the shutdown setting into a separate
function. Only FTP bothers with that.
Closes#18203
`getsock()` calls operated on a global limit that could
not be configure beyond 16 sockets. This is no longer adequate
with the new happy eyeballing strategy.
Instead, do the following:
- make `struct easy_pollset` dynamic. Starting with
a minimal room for two sockets, the very common case,
allow it to grow on demand.
- replace all protocol handler getsock() calls with pollsets
and a CURLcode to return failures
- add CURLcode return for all connection filter `adjust_pollset()`
callbacks, since they too can now fail.
- use appropriately in multi.c and multi_ev.c
- fix unit2600 to trigger pollset growth
Closes#18164
Drop `strcasecompare` and `strncasecompare` in favor of libcurl API
calls `curl_strequal` and `curl_strnequal` respectively.
Also drop unnecessary `strcase.h` includes. Include `curl/curl.h`
instead where it wasn't included before.
Closes#17772
Move curlx_ functions into its own subdir.
The idea is to use the curlx_ prefix proper on these functions, and use
these same function names both in tool, lib and test suite source code.
Stop the previous special #define setup for curlx_ names.
The printf defines are now done for the library alone. Tests no longer
use the printf defines. The tool code sets its own defines. The printf
functions are not curlx, they are publicly available.
The strcase defines are not curlx_ functions and should not be used by
tool or server code.
dynbuf, warnless, base64, strparse, timeval, timediff are now proper
curlx functions.
When libcurl is built statically, the functions from the library can be
used as-is. The key is then that the functions must work as-is, without
having to be recompiled for use in tool/tests. This avoids symbol
collisions - when libcurl is built statically, we use those functions
directly when building the tool/tests. When libcurl is shared, we
build/link them separately for the tool/tests.
Assisted-by: Jay Satiro
Closes#17253
The issues found fell into these categories, with the applied fixes:
- const was accidentally stripped.
Adjust code to not cast or cast with const.
- const/volatile missing from arguments, local variables.
Constify arguments or variables, adjust/delete casts. Small code
changes in a few places.
- const must be stripped because an API dependency requires it.
Strip `const` with `CURL_UNCONST()` macro to silence the warning out
of our control. These happen at API boundaries. Sometimes they depend
on dependency version, which this patch handles as necessary. Also
enable const support for the zlib API, using `ZLIB_CONST`. Supported
by zlib 1.2.5.2 and newer.
- const must be stripped because a curl API requires it.
Strip `const` with `CURL_UNCONST()` macro to silence the warning out
of our immediate control. For example we promise to send a non-const
argument to a callback, though the data is const internally.
- other cases where we may avoid const stripping by code changes.
Also silenced with `CURL_UNCONST()`.
- there are 3 places where `CURL_UNCONST()` is cast again to const.
To silence this type of warning:
```
lib/vquic/curl_osslq.c:1015:29: error: to be safe all intermediate
pointers in cast from 'unsigned char **' to 'const unsigned char **'
must be 'const' qualified [-Werror=cast-qual]
lib/cf-socket.c:734:32: error: to be safe all intermediate pointers in
cast from 'char **' to 'const char **' must be 'const' qualified
[-Werror=cast-qual]
```
There may be a better solution, but I couldn't find it.
These cases are handled in separate subcommits, but without further
markup.
If you see a `-Wcast-qual` warning in curl, we appreciate your report
about it.
Closes#16142
Adds a `follow()` callback to protocol handlers, so they may decide how
to act on a `newurl` after a request has been done. This is optional.
This moves the HTTP code for handling redirects from multi.c to http.c
where it should be. If we ever add a protocol with its own logic, it
would install its own follow function.
Closes#16075
Adds a `bool eos` flag to send methods to indicate that the data
is the last chunk the invovled transfer wants to send to the server.
This will help protocol filters like HTTP/2 and 3 to forward the
stream's EOF flag and also allow to EAGAIN such calls when buffers
are not yet fully flushed.
Closes#14220
Adds a `bool eos` flag to send methods to indicate that the data is the
last chunk the invovled transfer wants to send to the server.
This will help protocol filters like HTTP/2 and 3 to forward the
stream's EOF flag and also allow to EAGAIN such calls when buffers are
not yet fully flushed.
Closes#14220
- clarify Curl_xfer_setup() with RECV/SEND flags and different calls for
which socket they operate on. Add a shutdown flag for secondary
sockets
- change Curl_xfer_setup() calls to new functions
- implement non-blocking connection shutdown at the end of receiving or
sending a transfer
Closes#13913
- replace `Curl_read()`, `Curl_write()` and `Curl_nwrite()` to
clarify when and at what level they operate
- send/recv of transfer related data is now done via
`Curl_xfer_send()/Curl_xfer_recv()` which no longer has
socket/socketindex as parameter. It decides on the transfer
setup of `conn->sockfd` and `conn->writesockfd` on which
connection filter chain to operate.
- send/recv on a specific connection filter chain is done via
`Curl_conn_send()/Curl_conn_recv()` which get the socket index
as parameter.
- rename `Curl_setup_transfer()` to `Curl_xfer_setup()` for
naming consistency
- clarify that the special CURLE_AGAIN hangling to return
`CURLE_OK` with length 0 only applies to `Curl_xfer_send()`
and CURLE_AGAIN is returned by all other send() variants.
- fix a bug in websocket `curl_ws_recv()` that mixed up data
when it arrived in more than a single chunk
The method for sending not just raw bytes, but bytes that are either
"headers" or "body". The send abstraction stack, to to bottom, now is:
* `Curl_req_send()`: has parameter to indicate amount of header bytes,
buffers all data.
* `Curl_xfer_send()`: knows on which socket index to send, returns
amount of bytes sent.
* `Curl_conn_send()`: called with socket index, returns amount of bytes
sent.
In addition there is `Curl_req_flush()` for writing out all buffered
bytes.
`Curl_req_send()` is active for requests without body,
`Curl_buffer_send()` still being used for others. This is because the
special quirks need to be addressed in future parts:
* `expect-100` handling
* `Curl_fillreadbuffer()` needs to add directly to the new
`data->req.sendbuf`
* special body handlings, like `chunked` encodings and line end
conversions will be moved into something like a Client Reader.
In functions of the pattern `CURLcode xxx_send(..., ssize_t *written)`,
replace the `ssize_t` with a `size_t`. It makes no sense to allow for negative
values as the returned `CURLcode` already specifies error conditions. This
allows easier handling of lengths without casting.
Closes#12964
Curl_read/Curl_write clarifications
- replace `Curl_read()`, `Curl_write()` and `Curl_nwrite()` to 1clarify
when and at what level they operate
- send/recv of transfer related data is now done via
`Curl_xfer_send()/Curl_xfer_recv()` which no longer has
socket/socketindex as parameter. It decides on the transfer setup of
`conn->sockfd` and `conn->writesockfd` on which connection filter
chain to operate.
- send/recv on a specific connection filter chain is done via
`Curl_conn_send()/Curl_conn_recv()` which get the socket index as
parameter.
- rename `Curl_setup_transfer()` to `Curl_xfer_setup()` for naming
consistency
- clarify that the special CURLE_AGAIN handling to return `CURLE_OK`
with length 0 only applies to `Curl_xfer_send()` and CURLE_AGAIN is
returned by all other send() variants.
SingleRequest reshuffling
- move functions into request.[ch]
- differentiate between reset and free
- add Curl_req_done() to perform last actions
- add a send `bufq` to SingleRequest for future use in keeping upload data
Closes#12963
- delete redundant warning suppressions for `-Wformat-nonliteral`.
This now relies on `CURL_PRINTF()` and it's theoratically possible
that this macro isn't active but the warning is. We're ignoring this
as a corner-case here.
- replace two pragmas with code changes to avoid the warnings.
Follow-up to aee4ebe591#12803
Follow-up to 0923012758#12540
Follow-up to 3829759bd0#12489
Reviewed-by: Daniel Stenberg
Closes#12812
This clarifies the handling of server responses by folding the code for
the complicated protocols into their protocol handlers. This concerns
mainly HTTP and its bastard sibling RTSP.
The terms "read" and "write" are often used without clear context if
they refer to the connect or the client/application side of a
transfer. This PR uses "read/write" for operations on the client side
and "send/receive" for the connection, e.g. server side. If this is
considered useful, we can revisit renaming of further methods in another
PR.
Curl's protocol handler `readwrite()` method been changed:
```diff
- CURLcode (*readwrite)(struct Curl_easy *data, struct connectdata *conn,
- const char *buf, size_t blen,
- size_t *pconsumed, bool *readmore);
+ CURLcode (*write_resp)(struct Curl_easy *data, const char *buf, size_t blen,
+ bool is_eos, bool *done);
```
The name was changed to clarify that this writes reponse data to the
client side. The parameter changes are:
* `conn` removed as it always operates on `data->conn`
* `pconsumed` removed as the method needs to handle all data on success
* `readmore` removed as no longer necessary
* `is_eos` as indicator that this is the last call for the transfer
response (end-of-stream).
* `done` TRUE on return iff the transfer response is to be treated as
finished
This change affects many files only because of updated comments in
handlers that provide no implementation. The real change is that the
HTTP protocol handlers now provide an implementation.
The HTTP protocol handlers `write_resp()` implementation will get passed
**all** raw data of a server response for the transfer. The HTTP/1.x
formatted status and headers, as well as the undecoded response
body. `Curl_http_write_resp_hds()` is used internally to parse the
response headers and pass them on. This method is public as the RTSP
protocol handler also uses it.
HTTP/1.1 "chunked" transport encoding is now part of the general
*content encoding* writer stack, just like other encodings. A new flag
`CLIENTWRITE_EOS` was added for the last client write. This allows
writers to verify that they are in a valid end state. The chunked
decoder will check if it indeed has seen the last chunk.
The general response handling in `transfer.c:466` happens in function
`readwrite_data()`. This mainly operates now like:
```
static CURLcode readwrite_data(data, ...)
{
do {
Curl_xfer_recv_resp(data, buf)
...
Curl_xfer_write_resp(data, buf)
...
} while(interested);
...
}
```
All the response data handling is implemented in
`Curl_xfer_write_resp()`. It calls the protocol handler's `write_resp()`
implementation if available, or does the default behaviour.
All raw response data needs to pass through this function. Which also
means that anyone in possession of such data may call
`Curl_xfer_write_resp()`.
Closes#12480
https://best.openssf.org/Compiler-Hardening-Guides/Compiler-Options-Hardening-Guide-for-C-and-C++.html
as of 2023-11-29 [1].
Enable new recommended warnings (except `-Wsign-conversion`):
- enable `-Wformat=2` for clang (in both cmake and autotools).
- add `CURL_PRINTF()` internal attribute and mark functions accepting
printf arguments with it. This is a copy of existing
`CURL_TEMP_PRINTF()` but using `__printf__` to make it compatible
with redefinting the `printf` symbol:
https://gcc.gnu.org/onlinedocs/gcc-3.0.4/gcc_5.html#SEC94
- fix `CURL_PRINTF()` and existing `CURL_TEMP_PRINTF()` for
mingw-w64 and enable it on this platform.
- enable `-Wimplicit-fallthrough`.
- enable `-Wtrampolines`.
- add `-Wsign-conversion` commented with a FIXME.
- cmake: enable `-pedantic-errors` the way we do it with autotools.
Follow-up to d5c0351055#2747
- lib/curl_trc.h: use `CURL_FORMAT()`, this also fixes it to enable format
checks. Previously it was always disabled due to the internal `printf`
macro.
Fix them:
- fix bug where an `set_ipv6_v6only()` call was missed in builds with
`--disable-verbose` / `CURL_DISABLE_VERBOSE_STRINGS=ON`.
- add internal `FALLTHROUGH()` macro.
- replace obsolete fall-through comments with `FALLTHROUGH()`.
- fix fallthrough markups: Delete redundant ones (showing up as
warnings in most cases). Add missing ones. Fix indentation.
- silence `-Wformat-nonliteral` warnings with llvm/clang.
- fix one `-Wformat-nonliteral` warning.
- fix new `-Wformat` and `-Wformat-security` warnings.
- fix `CURL_FORMAT_SOCKET_T` value for mingw-w64. Also move its
definition to `lib/curl_setup.h` allowing use in `tests/server`.
- lib: fix two wrongly passed string arguments in log outputs.
Co-authored-by: Jay Satiro
- fix new `-Wformat` warnings on mingw-w64.
[1] 56c0fde389/docs/Compiler-Hardening-Guides/Compiler-Options-Hardening-Guide-for-C-and-C%2B%2B.mdCloses#12489
Out of 415 labels throughout the code base, 86 of those labels were
not at the start of the line. Which means labels always at the start of
the line is the favoured style overall with 329 instances.
Out of the 86 labels not at the start of the line:
* 75 were indented with the same indentation level of the following line
* 8 were indented with exactly one space
* 2 were indented with one fewer indentation level then the following
line
* 1 was indented with the indentation level of the following line minus
three space (probably unintentional)
Co-Authored-By: Viktor Szakats
Closes#11134
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING
checksrc is updated to allow non-year using copyright statements
Closes#10205
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.
This commit also adds a Github workflow to check pull requests and adapts
copyright.pl to the changes.
Closes#8869
- the data needs to be "line-based" anyway since it's also passed to the
debug callback/application
- it makes infof() work like failf() and consistency is good
- there's an assert that triggers on newlines in the format string
- Also removes a few instances of "..."
- Removes the code that would append "..." to the end of the data *iff*
it was truncated in infof()
Closes#7357
The libssh2 backend has SSH session associated with the connection but
the callback context is the easy handle, so when a connection gets
attached to a transfer, the protocol handler now allows for a custom
function to get used to set things up correctly.
Reported-by: Michael O'Farrell
Fixes#6898Closes#7078
... in most cases instead of 'struct connectdata *' but in some cases in
addition to.
- We mostly operate on transfers and not connections.
- We need the transfer handle to log, store data and more. Everything in
libcurl is driven by a transfer (the CURL * in the public API).
- This work clarifies and separates the transfers from the connections
better.
- We should avoid "conn->data". Since individual connections can be used
by many transfers when multiplexing, making sure that conn->data
points to the current and correct transfer at all times is difficult
and has been notoriously error-prone over the years. The goal is to
ultimately remove the conn->data pointer for this reason.
Closes#6425