Also:
- support per-directory and per-upper-directory whitelist entries.
- convert badlist input grep tweak into the above format.
(except for 'And' which had just a few hits.)
- fix many code exceptions, but do not enforce.
(there also remain about 350 'will' uses in lib)
- fix badwords in example code, drop exceptions.
- badwords-all: convert to Perl.
To make it usable from CMake.
- FAQ: reword to not use 'will'. Drop exception.
Closes#20886
- asyn-thrdd.c: scope an include.
- apply more clang-format suggestions.
- tidy-up PP guard comments.
- delete empty line from the top of headers.
- add empty line after `curl_setup.h` include where missing.
- fix indent.
- CODE_STYLE.md: add `strcpy`.
Follow-up to 8636ad55df#20088
- lib1901.c: drop unnecessary line.
Follow-up to 436e67f65b#20076Closes#20070
- curl_range: replace `sendf.h` with direct header dependency
`curl_trc.h`.
- drop `curl/curl.h` includes from internal sourcees in favor of the
include made from `curl_setup.h`. Replace it with the latter where
it's the only include.
- include `curl_setup.h` before using macros, where missing.
- drop redundant `stdlib.h`, `string.h` includes, in favor of
`curl_setup_once.h` including them.
- drop redundant `limits.h` in favor of `curl_setup.h` including it.
- fake_addrinfo.h: fix typo in comment.
- curl_setup_once.h: drop `stdio.h` in favor of earlier include in
`curl_setup.h`.
- drop stray, unused, `stddef.h` includes.
- memdebug.h: add missing `stddef.h` include. (relying on accidental
includes via other headers before this patch.)
- stddef.h: document why it's included.
- strerr: drop `curl/mprintf.h` in favor of `curl/curl.h` including it
via `curl_setup.h`.
Closes#20027
Use `data->progress.now` as the timestamp of proecssing a transfer.
Update it on significant events and refrain from calling `curlx_now()`
in many places.
The problem this addresses is
a) calling curlx_now() has costs, depending on platform. Calling it
every time results in 25% increase `./runtest` duration on macOS.
b) we used to pass a `struct curltime *` around to save on calls, but
when some method directly use `curx_now()` and some use the passed
pointer, the transfer experienes non-linear time. This results in
timeline checks to report events in the wrong order.
By keeping a timestamp in the easy handle and updating it there, no
longer invoking `curlx_now()` in the "lower" methods, the transfer
can observer a steady clock progression.
Add documentation in docs/internals/TIME-KEEPING.md
Reported-by: Viktor Szakats
Fixes#19935Closes#19961
New multi option CURLMOPT_NETWORK_CHANGED with a long bitmask value:
- CURLM_NWCOPT_CLEAR_CONNS: do not reuse existing connections, close all
idle connections.
- CURLM_NWCOPT_CLEAR_DNS: clear the multi's DNS cache.
All other bits reserved for future extensions.
Fixes#17225
Reported-by: ウさん
Closes#17613
Move curlx_ functions into its own subdir.
The idea is to use the curlx_ prefix proper on these functions, and use
these same function names both in tool, lib and test suite source code.
Stop the previous special #define setup for curlx_ names.
The printf defines are now done for the library alone. Tests no longer
use the printf defines. The tool code sets its own defines. The printf
functions are not curlx, they are publicly available.
The strcase defines are not curlx_ functions and should not be used by
tool or server code.
dynbuf, warnless, base64, strparse, timeval, timediff are now proper
curlx functions.
When libcurl is built statically, the functions from the library can be
used as-is. The key is then that the functions must work as-is, without
having to be recompiled for use in tool/tests. This avoids symbol
collisions - when libcurl is built statically, we use those functions
directly when building the tool/tests. When libcurl is shared, we
build/link them separately for the tool/tests.
Assisted-by: Jay Satiro
Closes#17253
The callback, provided from url.c did the work that the cshutdn
functionality also implemented. Remove it.
Change some DEBUGF(infof()) to CURL_TRC_M().
Closes#16810
Further testing with timeouts in event based processing revealed that
our current shutdown handling in the connection pool was not clear
enough. Graceful shutdowns can only happen inside a multi handle and it
was confusing to track in the code which situation actually applies. It
seems better to split the shutdown handling off and have that code
always be part of a multi handle.
Add `cshutdn.[ch]` with its own struct to maintain connections being
shut down. A `cshutdn` always belongs to a multi handle and uses that
for socket/timeout monitoring.
The `cpool`, which can be part of a multi or share, either passes
connections to a `cshutdn` or terminates them with a one-time, best
effort.
Add an `admin` easy handle to each multi and share. This is used to
perform all maintenance operations where no "real" easy handle is
available. This solves the problem that the multi admin handle requires
some additional initialisation (e.g. timeout list).
The share needs its admin handle as it is often cleaned up when no other
transfer or multi handle exists any more. But we need a `data` in almost
every call.
Fix file:// handling of errors when adding a new connection to the pool.
Changes in `curl` itself:
- for parallel transfers, do not set a connection pool in the share,
rely on the multi's connection pool instead. While not a requirement
for the new `cshutdn` to work, this is
a) helpful in testing to trigger graceful shutdowns
b) a broader code coverage of libcurl via the curl tool
- on test_event with uv, cleanup the multi handle before returning from
parallel_event(). The uv struct is on the stack, cleanup of the multi
later will crash when it tries to register sockets. This is a "eat
your own dogfood" related fix.
Closes#16508
Rework the event based handling of transfers and connections to
be "localized" into a single source file with clearer dependencies.
- add multi_ev.c and multi_ev.h
- add docs/internal/MULTI-EV.md to explain the overall workings
- only do event handling book keeping when the socket callback
is set
- add handling for "connection only" event tracking, when internal
easy handles are used that are not really tied to a connection.
Used in connection pool.
- remove transfer member "last_poll" and connections "shutdown_poll"
and keep all that internal to multi_ev.c
- add CURL_TRC_M() for tracing of "multi" related things, including
event handling and connection pool operations. Add new trace
feature "multi" for trace config.
multi traces will show exactly what is going on in regard to
event handling.
- multi: trace transfers "mstate" in every CURL_TRC_M() call
- make internal trace buffer 2048 bytes and end the silliness
with +n here -m there. Adjust test 1652 expectations of resulting
length and input edge cases.
- add trace feature "lib-ids" to perfix libcurl traces with transfer
and connection ids. Useful for debugging libcurl applications.
Closes#16308
- Make curl_multi_waitfds consistent with the documentation.
Issue Addressed:
- The documentation of curl_multi_waitfds indicates that users should
be able to call curl_multi_waitfds with a NULL ufds. However, before
this change, the function would return CURLM_BAD_FUNCTION_ARGUMENT.
- Additionally, the documentation suggests that users can use this
function to determine the number of file descriptors (fds) needed.
However, the function would stop counting fds if the supplied fds
were exhausted.
Changes Made:
- NULL ufds Handling: curl_multi_waitfds can now accept a NULL ufds if
size is also zero.
- Counting File Descriptors: If curl_multi_waitfds is passed a NULL
ufds, or the size of ufds is insufficient, the output parameter
fd_count will return the number of fds needed. This value may be
higher than actually needed but never lower.
Testing:
- Test 2405 has been updated to cover the usage scenarios described
above.
Fixes https://github.com/curl/curl/issues/15146
Closes https://github.com/curl/curl/pull/15155
This is a better match for what they do and the general "cpool"
var/function prefix works well.
The pool now handles very long hostnames correctly.
The following changes have been made:
* 'struct connectdata', e.g. connections, keep new members
named `destination` and ' destination_len' that fully specifies
interface+port+hostname of where the connection is going to.
This is used in the pool for "bundling" of connections with
the same destination. There is no limit on the length any more.
* Locking: all locks are done inside conncache.c when calling
into the pool and released on return. This eliminates hazards
of the callers keeping track.
* 'struct connectbundle' is now internal to the pool. It is no
longer referenced by a connection.
* 'bundle->multiuse' no longer exists. HTTP/2 and 3 and TLS filters
no longer need to set it. Instead, the multi checks on leaving
MSTATE_CONNECT or MSTATE_CONNECTING if the connection is now
multiplexed and new, e.g. not conn->bits.reuse. In that case
the processing of pending handles is triggered.
* The pool's init is provided with a callback to invoke on all
connections being discarded. This allows the cleanups in
`Curl_disconnect` to run, wherever it is decided to retire
a connection.
* Several pool operations can now be fully done with one call.
Pruning dead connections, upkeep and checks on pool limits
can now directly discard connections and need no longer return
those to the caller for doing that (as we have now the callback
described above).
* Finding a connection for reuse is now done via `Curl_cpool_find()`
and the caller provides callbacks to evaluate the connection
candidates.
* The 'Curl_cpool_check_limits()' now directly uses the max values
that may be set in the transfer's multi. No need to pass them
around. Curl_multi_max_host_connections() and
Curl_multi_max_total_connections() are gone.
* Add method 'Curl_node_llist()' to get the llist a node is in.
Used in cpool to verify connection are indeed in the list (or
not in any list) as they need to.
I left the conncache.[ch] as is for now and also did not touch the
documentation. If we update that outside the feature window, we can
do this in a separate PR.
Multi-thread safety is not achieved by this PR, but since more details
on how pools operate are now "internal" it is a better starting
point to go for this in the future.
Closes#14662
Based on the standards and guidelines we use for our documentation.
- expand contractions (they're => they are etc)
- host name = > hostname
- file name => filename
- user name = username
- man page => manpage
- run-time => runtime
- set-up => setup
- back-end => backend
- a HTTP => an HTTP
- Two spaces after a period => one space after period
Closes#14073
When libcurl discards a connection there are two phases this may go
through: "shutdown" and "closing". If a connection is aborted, the
shutdown phase is skipped and it is closed right away.
The connection filters attached to the connection implement the phases
in their `do_shutdown()` and `do_close()` callbacks. Filters carry now a
`shutdown` flags next to `connected` to keep track of the shutdown
operation.
Filters are shut down from top to bottom. If a filter is not connected,
its shutdown is skipped. Notable filters that *do* something during
shutdown are HTTP/2 and TLS. HTTP/2 sends the GOAWAY frame. TLS sends
its close notify and expects to receive a close notify from the server.
As sends and receives may EAGAIN on the network, a shutdown is often not
successful right away and needs to poll the connection's socket(s). To
facilitate this, such connections are placed on a new shutdown list
inside the connection cache.
Since managing this list requires the cooperation of a multi handle,
only the connection cache belonging to a multi handle is used. If a
connection was in another cache when being discarded, it is removed
there and added to the multi's cache. If no multi handle is available at
that time, the connection is shutdown and closed in a one-time,
best-effort attempt.
When a multi handle is destroyed, all connection still on the shutdown
list are discarded with a final shutdown attempt and close. In curl
debug builds, the environment variable `CURL_GRACEFUL_SHUTDOWN` can be
set to make this graceful with a timeout in milliseconds given by the
variable.
The shutdown list is limited to the max number of connections configured
for a multi cache. Set via CURLMOPT_MAX_TOTAL_CONNECTIONS. When the
limit is reached, the oldest connection on the shutdown list is
discarded.
- In multi_wait() and multi_waitfds(), collect all connection caches
involved (each transfer might carry its own) into a temporary list.
Let each connection cache on the list contribute sockets and
POLLIN/OUT events it's connections are waiting for.
- in multi_perform() collect the connection caches the same way and let
them peform their maintenance. This will make another non-blocking
attempt to shutdown all connections on its shutdown list.
- for event based multis (multi->socket_cb set), add the sockets and
their poll events via the callback. When `multi_socket()` is invoked
for a socket not known by an active transfer, forward this to the
multi's cache for processing. On closing a connection, remove its
socket(s) via the callback.
TLS connection filters MUST NOT send close nofity messages in their
`do_close()` implementation. The reason is that a TLS close notify
signals a success. When a connection is aborted and skips its shutdown
phase, the server needs to see a missing close notify to detect
something has gone wrong.
A graceful shutdown of FTP's data connection is performed implicitly
before regarding the upload/download as complete and continuing on the
control connection. For FTP without TLS, there is just the socket close
happening. But with TLS, the sent/received close notify signals that the
transfer is complete and healthy. Servers like `vsftpd` verify that and
reject uploads without a TLS close notify.
- added test_19_* for shutdown related tests
- test_19_01 and test_19_02 test for TCP RST packets
which happen without a graceful shutdown and should
no longer appear otherwise.
- add test_19_03 for handling shutdowns by the server
- add test_19_04 for handling shutdowns by curl
- add test_19_05 for event based shutdowny by server
- add test_30_06/07 and test_31_06/07 for shutdown checks
on FTP up- and downloads.
Closes#13976
`CURLDEBUG` is meant to enable memory tracking, but in a bunch of cases,
it was protecting debug features that were supposed to be guarded with
`DEBUGBUILD`.
Replace these uses with `DEBUGBUILD`.
This leaves `CURLDEBUG` uses solely for its intended purpose: to enable
the memory tracking debug feature.
Also:
- autotools: rely on `DEBUGBUILD` to enable `checksrc`.
Instead of `CURLDEBUG`, which worked in most cases because debug
builds enable `CURLDEBUG` by default, but it's not accurate.
- include `lib/easyif.h` instead of keeping a copy of a declaration.
- add CI test jobs for the build issues discovered.
Ref: https://github.com/curl/curl/pull/13694#issuecomment-2120311894Closes#13718
- add an `id` long to Curl_easy, -1 on init
- once added to a multi (or its own multi), it gets
a non-negative number assigned by the connection cache
- `id` is unique among all transfers using the same
cache until reaching LONG_MAX where it will wrap
around. So, not unique eternally.
- CURLINFO_CONN_ID returns the connection id attached to
data or, if none present, data->state.lastconnect_id
- variables and type declared in tool for write out
Closes#11185
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING
checksrc is updated to allow non-year using copyright statements
Closes#10205
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.
This commit also adds a Github workflow to check pull requests and adapts
copyright.pl to the changes.
Closes#8869
To simplify, and also since the returned name is not the full actual
name used for the check. The port number and zone id is also involved,
so just showing the name is misleading.
Closes#8750
... in most cases instead of 'struct connectdata *' but in some cases in
addition to.
- We mostly operate on transfers and not connections.
- We need the transfer handle to log, store data and more. Everything in
libcurl is driven by a transfer (the CURL * in the public API).
- This work clarifies and separates the transfers from the connections
better.
- We should avoid "conn->data". Since individual connections can be used
by many transfers when multiplexing, making sure that conn->data
points to the current and correct transfer at all times is difficult
and has been notoriously error-prone over the years. The goal is to
ultimately remove the conn->data pointer for this reason.
Closes#6425
More connection cache accesses are protected by locks.
CONNCACHE_* is a beter prefix for the connection cache lock macros.
Curl_attach_connnection: now called as soon as there's a connection
struct available and before the connection is added to the connection
cache.
Curl_disconnect: now assumes that the connection is already removed from
the connection cache.
Ref: #4915Closes#5009
It could accidentally let the connection get used by more than one
thread, leading to double-free and more.
Reported-by: Christopher Reid
Fixes#4544Closes#4557
Only HTTP proxy use where multiple host names can be used over the same
connection should use the proxy host name for bundles.
Reported-by: Tom van der Woerdt
Fixes#3951Closes#3955
Do not assume/store assocation between a given easy handle and the
connection if it can be avoided.
Long-term, the 'conn->data' pointer should probably be removed as it is a
little too error-prone. Still used very widely though.
Reported-by: masbug on github
Fixes#3391Closes#3400
If the lock is released before the dealings with the bundle is over, it may
have changed by another thread in the mean time.
Fixes#2132Fixes#2151Closes#2139
... to make all libcurl internals able to use the same data types for
the struct members. The timeval struct differs subtly on several
platforms so it makes it cumbersome to use everywhere.
Ref: #1652Closes#1693
to allow code to act differently on the situation.
Also added some more info message for the connection re-use function to
make it clearer when connections are not re-used.
All the existing Curl_bundle* functions were only ever used from within
the conncache.c file, so I moved them over and made them static (and
removed the Curl_ prefix).