Import Upstream version 2.7.18

geos_one
2025-08-15 16:28:06 +02:00
commit ba1f69ab39
4521 changed files with 1778434 additions and 0 deletions

@@ -0,0 +1,607 @@
.. _compound:
*******************
Compound statements
*******************
.. index:: pair: compound; statement
Compound statements contain (groups of) other statements; they affect or control
the execution of those other statements in some way. In general, compound
statements span multiple lines, although in simple incarnations a whole compound
statement may be contained in one line.
The :keyword:`if`, :keyword:`while` and :keyword:`for` statements implement
traditional control flow constructs. :keyword:`try` specifies exception
handlers and/or cleanup code for a group of statements. Function and class
definitions are also syntactically compound statements.
.. index::
single: clause
single: suite
Compound statements consist of one or more 'clauses.' A clause consists of a
header and a 'suite.' The clause headers of a particular compound statement are
all at the same indentation level. Each clause header begins with a uniquely
identifying keyword and ends with a colon. A suite is a group of statements
controlled by a clause. A suite can be one or more semicolon-separated simple
statements on the same line as the header, following the header's colon, or it
can be one or more indented statements on subsequent lines. Only the latter
form of suite can contain nested compound statements; the following is illegal,
mostly because it wouldn't be clear to which :keyword:`if` clause a following
:keyword:`else` clause would belong: ::
   if test1: if test2: print x
Also note that the semicolon binds tighter than the colon in this context, so
that in the following example, either all or none of the :keyword:`print`
statements are executed::
   if x < y < z: print x; print y; print z
Summarizing:
.. productionlist::
compound_stmt: `if_stmt`
: | `while_stmt`
: | `for_stmt`
: | `try_stmt`
: | `with_stmt`
: | `funcdef`
: | `classdef`
: | `decorated`
suite: `stmt_list` NEWLINE | NEWLINE INDENT `statement`+ DEDENT
statement: `stmt_list` NEWLINE | `compound_stmt`
stmt_list: `simple_stmt` (";" `simple_stmt`)* [";"]
.. index::
single: NEWLINE token
single: DEDENT token
pair: dangling; else
Note that statements always end in a ``NEWLINE`` possibly followed by a
``DEDENT``. Also note that optional continuation clauses always begin with a
keyword that cannot start a statement, thus there are no ambiguities (the
'dangling :keyword:`else`' problem is solved in Python by requiring nested
:keyword:`if` statements to be indented).
The formatting of the grammar rules in the following sections places each clause
on a separate line for clarity.
.. _if:
.. _elif:
.. _else:
The :keyword:`if` statement
===========================
.. index::
statement: if
keyword: elif
keyword: else
The :keyword:`if` statement is used for conditional execution:
.. productionlist::
if_stmt: "if" `expression` ":" `suite`
: ( "elif" `expression` ":" `suite` )*
: ["else" ":" `suite`]
It selects exactly one of the suites by evaluating the expressions one by one
until one is found to be true (see section :ref:`booleans` for the definition of
true and false); then that suite is executed (and no other part of the
:keyword:`if` statement is executed or evaluated). If all expressions are
false, the suite of the :keyword:`else` clause, if present, is executed.
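A minimal sketch of this selection rule. The helper ``check`` and the list ``evaluated`` are hypothetical names introduced only for illustration; they record which conditions are actually evaluated. The code is written to run unchanged on Python 2.7 and Python 3.

```python
evaluated = []  # records which conditions were actually evaluated


def check(label, result):
    """Hypothetical helper: note that a condition ran, then return its value."""
    evaluated.append(label)
    return result


if check("first", False):
    chosen = "if suite"
elif check("second", True):
    chosen = "first elif suite"
elif check("third", True):      # never evaluated: an earlier test was true
    chosen = "second elif suite"
else:
    chosen = "else suite"
```

Once the second expression is found to be true, the third expression is not evaluated at all.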
.. _while:
The :keyword:`while` statement
==============================
.. index::
statement: while
pair: loop; statement
keyword: else
The :keyword:`while` statement is used for repeated execution as long as an
expression is true:
.. productionlist::
while_stmt: "while" `expression` ":" `suite`
: ["else" ":" `suite`]
This repeatedly tests the expression and, if it is true, executes the first
suite; if the expression is false (which may be the first time it is tested) the
suite of the :keyword:`else` clause, if present, is executed and the loop
terminates.
.. index::
statement: break
statement: continue
A :keyword:`break` statement executed in the first suite terminates the loop
without executing the :keyword:`else` clause's suite. A :keyword:`continue`
statement executed in the first suite skips the rest of the suite and goes back
to testing the expression.
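The interaction between :keyword:`break` and the :keyword:`else` clause can be sketched as follows. The function name ``find_first_negative`` is an assumption made for this example only.

```python
def find_first_negative(values):
    """Return the index of the first negative value, or None if there is none."""
    i = 0
    while i < len(values):
        if values[i] < 0:
            result = i
            break            # leaves the loop and skips the else suite
        i += 1
    else:
        result = None        # runs only when the loop test became false
    return result
```

For an empty list the expression is false the first time it is tested, so only the :keyword:`else` suite runs.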
.. _for:
The :keyword:`for` statement
============================
.. index::
statement: for
pair: loop; statement
keyword: in
keyword: else
pair: target; list
object: sequence
The :keyword:`for` statement is used to iterate over the elements of a sequence
(such as a string, tuple or list) or other iterable object:
.. productionlist::
for_stmt: "for" `target_list` "in" `expression_list` ":" `suite`
: ["else" ":" `suite`]
The expression list is evaluated once; it should yield an iterable object. An
iterator is created for the result of the ``expression_list``. The suite is
then executed once for each item provided by the iterator, in the order the
iterator yields them (ascending indices for a sequence). Each item in turn is
assigned to the target list using the
standard rules for assignments, and then the suite is executed. When the items
are exhausted (which is immediately when the sequence is empty), the suite in
the :keyword:`else` clause, if present, is executed, and the loop terminates.
.. index::
statement: break
statement: continue
A :keyword:`break` statement executed in the first suite terminates the loop
without executing the :keyword:`else` clause's suite. A :keyword:`continue`
statement executed in the first suite skips the rest of the suite and continues
with the next item, or with the :keyword:`else` clause if there was no next
item.
The suite may assign to the variable(s) in the target list; this does not affect
the next item assigned to it.
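A short sketch of both points, using illustrative names (``seen``, ``outcome``) that are not part of the reference text: rebinding the target inside the suite does not disturb the iteration, and the :keyword:`else` suite runs when the loop is not left through :keyword:`break`.

```python
seen = []
for i in range(3):
    seen.append(i)
    i = 100          # rebinding the target does not change the next item

# The else suite runs because the loop below is never left through break.
for word in ["spam", "eggs"]:
    if word == "ham":
        outcome = "found"
        break
else:
    outcome = "not found"
```

After the first loop, ``i`` still exists and holds the last value assigned to it.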
.. index::
builtin: range
pair: Pascal; language
The target list is not deleted when the loop is finished, but if the sequence is
empty, it will not have been assigned to at all by the loop. Hint: the built-in
function :func:`range` returns a sequence of integers suitable to emulate the
effect of Pascal's ``for i := a to b do``; e.g., ``range(3)`` returns the list
``[0, 1, 2]``.
.. note::
.. index::
single: loop; over mutable sequence
single: mutable sequence; loop over
There is a subtlety when the sequence is being modified by the loop (this can
only occur for mutable sequences, e.g. lists). An internal counter is used to
keep track of which item is used next, and this is incremented on each
iteration. When this counter has reached the length of the sequence the loop
terminates. This means that if the suite deletes the current (or a previous)
item from the sequence, the next item will be skipped (since it gets the index
of the current item which has already been treated). Likewise, if the suite
inserts an item in the sequence before the current item, the current item will
be treated again the next time through the loop. This can lead to nasty bugs
that can be avoided by making a temporary copy using a slice of the whole
sequence, e.g., ::
   for x in a[:]:
       if x < 0: a.remove(x)
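The skipping effect can be observed directly. The following sketch uses made-up sample data and contrasts iterating over the list itself with iterating over a slice copy.

```python
data = [-1, -2, 3, -4, -5]
for x in data:             # iterating over the list that is being modified
    if x < 0:
        data.remove(x)     # later items shift left, so the next one is skipped

safe = [-1, -2, 3, -4, -5]
for x in safe[:]:          # iterating over a temporary copy instead
    if x < 0:
        safe.remove(x)
```

In the first loop ``-2`` and ``-5`` are never visited, so they remain in ``data``.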
.. _try:
.. _except:
.. _finally:
The :keyword:`try` statement
============================
.. index::
statement: try
keyword: except
keyword: finally
The :keyword:`try` statement specifies exception handlers and/or cleanup code
for a group of statements:
.. productionlist::
try_stmt: try1_stmt | try2_stmt
try1_stmt: "try" ":" `suite`
: ("except" [`expression` [("as" | ",") `identifier`]] ":" `suite`)+
: ["else" ":" `suite`]
: ["finally" ":" `suite`]
try2_stmt: "try" ":" `suite`
: "finally" ":" `suite`
.. versionchanged:: 2.5
In previous versions of Python, :keyword:`try`...\ :keyword:`except`...\
:keyword:`finally` did not work. :keyword:`try`...\ :keyword:`except` had to be
nested in :keyword:`try`...\ :keyword:`finally`.
The :keyword:`except` clause(s) specify one or more exception handlers. When no
exception occurs in the :keyword:`try` clause, no exception handler is executed.
When an exception occurs in the :keyword:`try` suite, a search for an exception
handler is started. This search inspects the except clauses in turn until one
is found that matches the exception. An expression-less except clause, if
present, must be last; it matches any exception. For an except clause with an
expression, that expression is evaluated, and the clause matches the exception
if the resulting object is "compatible" with the exception. An object is
compatible with an exception if it is the class or a base class of the exception
object, or a tuple containing an item compatible with the exception.
If no except clause matches the exception, the search for an exception handler
continues in the surrounding code and on the invocation stack. [#]_
If the evaluation of an expression in the header of an except clause raises an
exception, the original search for a handler is canceled and a search starts for
the new exception in the surrounding code and on the call stack (it is treated
as if the entire :keyword:`try` statement raised the exception).
When a matching except clause is found, the exception is assigned to the target
specified in that except clause, if present, and the except clause's suite is
executed. All except clauses must have an executable block. When the end of
this block is reached, execution continues normally after the entire try
statement. (This means that if two nested handlers exist for the same
exception, and the exception occurs in the try clause of the inner handler, the
outer handler will not handle the exception.)
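The matching rules can be illustrated with a small sketch. The function ``classify`` is a hypothetical helper; the exception classes are the standard built-in ones.

```python
def classify(exc):
    """Raise exc and report which except clause handled it."""
    try:
        raise exc
    except KeyError:                     # the class of the exception itself
        return "KeyError clause"
    except LookupError:                  # a base class of IndexError
        return "LookupError clause"
    except (TypeError, ValueError):      # a tuple containing a compatible item
        return "tuple clause"
    except Exception:                    # inspected last among these clauses
        return "Exception clause"
```

Although :exc:`KeyError` is also a subclass of :exc:`LookupError`, the clauses are inspected in turn, so the first compatible one is selected.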
.. index::
module: sys
object: traceback
single: exc_type (in module sys)
single: exc_value (in module sys)
single: exc_traceback (in module sys)
Before an except clause's suite is executed, details about the exception are
assigned to three variables in the :mod:`sys` module: ``sys.exc_type`` receives
the object identifying the exception; ``sys.exc_value`` receives the exception's
parameter; ``sys.exc_traceback`` receives a traceback object (see section
:ref:`types`) identifying the point in the program where the exception
occurred. These details are also available through the :func:`sys.exc_info`
function, which returns a tuple ``(exc_type, exc_value, exc_traceback)``. Use
of the corresponding variables is deprecated in favor of this function, since
their use is unsafe in a threaded program. As of Python 1.5, the variables are
restored to their previous values (before the call) when returning from a
function that handled an exception.
.. index::
keyword: else
statement: return
statement: break
statement: continue
The optional :keyword:`else` clause is executed if the control flow leaves the
:keyword:`try` suite, no exception was raised, and no :keyword:`return`,
:keyword:`continue`, or :keyword:`break` statement was executed. Exceptions in
the :keyword:`else` clause are not handled by the preceding :keyword:`except`
clauses.
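As an illustration, the sketch below keeps only the call that may fail inside the :keyword:`try` suite and moves the follow-up work into the :keyword:`else` clause. The function name ``parse_int`` is an assumption of this example.

```python
def parse_int(text):
    """Return (value, note); the else suite runs only if int() succeeded."""
    try:
        value = int(text)
    except ValueError:
        return None, "not a number"
    else:
        # An exception raised here would not be caught by the except above.
        return value, "parsed"
```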
.. index:: keyword: finally
If :keyword:`finally` is present, it specifies a 'cleanup' handler. The
:keyword:`try` clause is executed, including any :keyword:`except` and
:keyword:`else` clauses. If an exception occurs in any of the clauses and is
not handled, the exception is temporarily saved. The :keyword:`finally` clause
is executed. If there is a saved exception, it is re-raised at the end of the
:keyword:`finally` clause. If the :keyword:`finally` clause raises another
exception or executes a :keyword:`return` or :keyword:`break` statement, the
saved exception is discarded::
   >>> def f():
   ...     try:
   ...         1/0
   ...     finally:
   ...         return 42
   ...
   >>> f()
   42
The exception information is not available to the program during execution of
the :keyword:`finally` clause.
.. index::
statement: return
statement: break
statement: continue
When a :keyword:`return`, :keyword:`break` or :keyword:`continue` statement is
executed in the :keyword:`try` suite of a :keyword:`try`...\ :keyword:`finally`
statement, the :keyword:`finally` clause is also executed 'on the way out.' A
:keyword:`continue` statement is illegal in the :keyword:`finally` clause. (The
reason is a problem with the current implementation; this restriction may be
lifted in the future.)
The return value of a function is determined by the last :keyword:`return`
statement executed. Since the :keyword:`finally` clause always executes, a
:keyword:`return` statement executed in the :keyword:`finally` clause will
always be the last one executed::
   >>> def foo():
   ...     try:
   ...         return 'try'
   ...     finally:
   ...         return 'finally'
   ...
   >>> foo()
   'finally'
Additional information on exceptions can be found in section :ref:`exceptions`,
and information on using the :keyword:`raise` statement to generate exceptions
may be found in section :ref:`raise`.
.. _with:
.. _as:
The :keyword:`with` statement
=============================
.. index::
statement: with
single: as; with statement
.. versionadded:: 2.5
The :keyword:`with` statement is used to wrap the execution of a block with
methods defined by a context manager (see section :ref:`context-managers`). This
allows common :keyword:`try`...\ :keyword:`except`...\ :keyword:`finally` usage
patterns to be encapsulated for convenient reuse.
.. productionlist::
with_stmt: "with" with_item ("," with_item)* ":" `suite`
with_item: `expression` ["as" `target`]
The execution of the :keyword:`with` statement with one "item" proceeds as follows:
#. The context expression (the expression given in the :token:`with_item`) is
evaluated to obtain a context manager.
#. The context manager's :meth:`__exit__` is loaded for later use.
#. The context manager's :meth:`__enter__` method is invoked.
#. If a target was included in the :keyword:`with` statement, the return value
from :meth:`__enter__` is assigned to it.
.. note::
The :keyword:`with` statement guarantees that if the :meth:`__enter__` method
returns without an error, then :meth:`__exit__` will always be called. Thus, if
an error occurs during the assignment to the target list, it will be treated the
same as an error occurring within the suite would be. See step 6 below.
#. The suite is executed.
#. The context manager's :meth:`__exit__` method is invoked. If an exception
caused the suite to be exited, its type, value, and traceback are passed as
arguments to :meth:`__exit__`. Otherwise, three :const:`None` arguments are
supplied.
If the suite was exited due to an exception, and the return value from the
:meth:`__exit__` method was false, the exception is reraised. If the return
value was true, the exception is suppressed, and execution continues with the
statement following the :keyword:`with` statement.
If the suite was exited for any reason other than an exception, the return value
from :meth:`__exit__` is ignored, and execution proceeds at the normal location
for the kind of exit that was taken.
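The steps above can be traced with a small sketch. ``Tracker`` and ``events`` are hypothetical names used only to log the order of operations; they are not part of any library.

```python
events = []  # hypothetical log used to observe the order of the steps


class Tracker(object):
    """Hypothetical context manager that records each step."""

    def __init__(self, suppress):
        self.suppress = suppress

    def __enter__(self):
        events.append("enter")
        return "resource"             # assigned to the target after 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        events.append(("exit", exc_type))
        return self.suppress          # a true value suppresses the exception


with Tracker(suppress=True) as target:
    events.append(target)
    raise ValueError("suppressed by __exit__")

events.append("after with")
```

Because ``__exit__`` returns a true value here, execution continues with the statement following the :keyword:`with` statement; with ``suppress=False`` the :exc:`ValueError` would be reraised.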
With more than one item, the context managers are processed as if multiple
:keyword:`with` statements were nested::
   with A() as a, B() as b:
       suite
is equivalent to ::
   with A() as a:
       with B() as b:
           suite
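A consequence of this nesting is that the managers are entered left to right and exited in reverse order. The sketch below assumes a hypothetical ``Named`` context manager that only records its calls; it requires Python 2.7 or later because of the multi-item form.

```python
order = []  # hypothetical log of enter and exit calls


class Named(object):
    """Hypothetical context manager that records when it is entered and exited."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        order.append("enter " + self.name)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        order.append("exit " + self.name)
        return False                  # never suppress exceptions


with Named("A") as a, Named("B") as b:
    order.append("suite")
```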
.. note::
In Python 2.5, the :keyword:`with` statement is only allowed when the
``with_statement`` feature has been enabled. It is always enabled in
Python 2.6.
.. versionchanged:: 2.7
Support for multiple context expressions.
.. seealso::
:pep:`343` - The "with" statement
The specification, background, and examples for the Python :keyword:`with`
statement.
.. index::
single: parameter; function definition
.. _function:
.. _def:
Function definitions
====================
.. index::
statement: def
pair: function; definition
pair: function; name
pair: name; binding
object: user-defined function
object: function
A function definition defines a user-defined function object (see section
:ref:`types`):
.. productionlist::
decorated: decorators (classdef | funcdef)
decorators: `decorator`+
decorator: "@" `dotted_name` ["(" [`argument_list` [","]] ")"] NEWLINE
funcdef: "def" `funcname` "(" [`parameter_list`] ")" ":" `suite`
dotted_name: `identifier` ("." `identifier`)*
parameter_list: (`defparameter` ",")*
: ( "*" `identifier` ["," "**" `identifier`]
: | "**" `identifier`
: | `defparameter` [","] )
defparameter: `parameter` ["=" `expression`]
sublist: `parameter` ("," `parameter`)* [","]
parameter: `identifier` | "(" `sublist` ")"
funcname: `identifier`
A function definition is an executable statement. Its execution binds the
function name in the current local namespace to a function object (a wrapper
around the executable code for the function). This function object contains a
reference to the current global namespace as the global namespace to be used
when the function is called.
The function definition does not execute the function body; this gets executed
only when the function is called. [#]_
.. index::
statement: @
A function definition may be wrapped by one or more :term:`decorator` expressions.
Decorator expressions are evaluated when the function is defined, in the scope
that contains the function definition. The result must be a callable, which is
invoked with the function object as the only argument. The returned value is
bound to the function name instead of the function object. Multiple decorators
are applied in nested fashion. For example, the following code::
   @f1(arg)
   @f2
   def func(): pass
is equivalent to::
   def func(): pass
   func = f1(arg)(f2(func))
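The order of application can be checked with a runnable sketch. Here ``f1`` and ``f2`` are given hypothetical bodies that only record when they are applied and return the function unchanged.

```python
applied = []  # records the order in which decorators are applied


def f1(arg):
    """Hypothetical decorator factory: evaluated first, applied last."""
    def decorate(func):
        applied.append("f1(%s)" % arg)
        return func
    return decorate


def f2(func):
    """Hypothetical plain decorator: applied first, closest to the def."""
    applied.append("f2")
    return func


@f1("arg")
@f2
def func():
    return "body"
```

The decorator nearest to the ``def`` is applied first, matching the nested call ``f1(arg)(f2(func))``.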
.. index::
triple: default; parameter; value
single: argument; function definition
When one or more top-level :term:`parameters <parameter>` have the form
*parameter* ``=`` *expression*, the function is said to have "default parameter
values." For a parameter with a default value, the corresponding
:term:`argument` may be omitted from a call, in which
case the parameter's default value is substituted. If a
parameter has a default value, all following parameters must also have a default
value --- this is a syntactic restriction that is not expressed by the grammar.
**Default parameter values are evaluated when the function definition is
executed.** This means that the expression is evaluated once, when the function
is defined, and that the same "pre-computed" value is used for each call. This
is especially important to understand when a default parameter is a mutable
object, such as a list or a dictionary: if the function modifies the object
(e.g. by appending an item to a list), the default value is in effect modified.
This is generally not what was intended. A way around this is to use ``None``
as the default, and explicitly test for it in the body of the function, e.g.::
   def whats_on_the_telly(penguin=None):
       if penguin is None:
           penguin = []
       penguin.append("property of the zoo")
       return penguin
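For contrast, the following sketch shows the pitfall next to the ``None`` idiom. The function names ``shared_default`` and ``fresh_default`` are invented for this example.

```python
def shared_default(item, bucket=[]):
    """The default list is created once, when the def statement executes."""
    bucket.append(item)
    return bucket


def fresh_default(item, bucket=None):
    """A new list is created on each call that omits bucket."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = shared_default("a")
second = shared_default("b")   # appends to the same list as the first call
```

Both calls to ``shared_default`` return the very same list object, which is why the second result also contains ``"a"``.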
.. index::
statement: *
statement: **
Function call semantics are described in more detail in section :ref:`calls`. A
function call always assigns values to all parameters mentioned in the parameter
list, either from positional arguments, from keyword arguments, or from default
values. If the form "``*identifier``" is present, it is initialized to a tuple
receiving any excess positional arguments, defaulting to the empty tuple. If
the form "``**identifier``" is present, it is initialized to a new dictionary
receiving any excess keyword arguments, defaulting to a new empty dictionary.
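A brief sketch of how excess arguments are collected. The function ``collect`` and its parameter names are illustrative assumptions.

```python
def collect(first, second="default", *extra, **options):
    """Return every parameter so the binding of arguments can be inspected."""
    return first, second, extra, options
```

Calling ``collect(1)`` leaves ``extra`` as the empty tuple and ``options`` as a new empty dictionary, while ``collect(1, 2, 3, 4, key="v")`` places ``3`` and ``4`` in ``extra`` and ``key`` in ``options``.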
.. index:: pair: lambda; expression
It is also possible to create anonymous functions (functions not bound to a
name), for immediate use in expressions. This uses lambda expressions, described in
section :ref:`lambda`. Note that the lambda expression is merely a shorthand for a
simplified function definition; a function defined in a ":keyword:`def`"
statement can be passed around or assigned to another name just like a function
defined by a lambda expression. The ":keyword:`def`" form is actually more powerful
since it allows the execution of multiple statements.
**Programmer's note:** Functions are first-class objects. A "``def``" form
executed inside a function definition defines a local function that can be
returned or passed around. Free variables used in the nested function can
access the local variables of the function containing the def. See section
:ref:`naming` for details.
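A minimal sketch of a nested function that uses a free variable. The names ``make_scaler`` and ``scale`` are hypothetical.

```python
def make_scaler(factor):
    """Return a local function that remembers the enclosing 'factor'."""
    def scale(value):
        return value * factor   # 'factor' is a free variable of scale
    return scale


double = make_scaler(2)
triple = make_scaler(3)
```

Each returned function refers to the ``factor`` of the call that created it.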
.. _class:
Class definitions
=================
.. index::
object: class
statement: class
pair: class; definition
pair: class; name
pair: name; binding
pair: execution; frame
single: inheritance
single: docstring
A class definition defines a class object (see section :ref:`types`):
.. productionlist::
classdef: "class" `classname` [`inheritance`] ":" `suite`
inheritance: "(" [`expression_list`] ")"
classname: `identifier`
A class definition is an executable statement. It first evaluates the
inheritance list, if present. Each item in the inheritance list should evaluate
to a class object or class type which allows subclassing. The class's suite is
then executed in a new execution frame (see section :ref:`naming`), using a
newly created local namespace and the original global namespace. (Usually, the
suite contains only function definitions.) When the class's suite finishes
execution, its execution frame is discarded but its local namespace is
saved. [#]_ A class object is then created using the inheritance list for the
base classes and the saved local namespace for the attribute dictionary. The
class name is bound to this class object in the original local namespace.
**Programmer's note:** Variables defined in the class definition are class
variables; they are shared by all instances. To create instance variables, they
can be set in a method with ``self.name = value``. Both class and instance
variables are accessible through the notation "``self.name``", and an instance
variable hides a class variable with the same name when accessed in this way.
Class variables can be used as defaults for instance variables, but using
mutable values there can lead to unexpected results. For :term:`new-style
class`\es, descriptors can be used to create instance variables with different
implementation details.
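The difference between class and instance variables, including the surprise with mutable values, can be sketched as follows. The class ``Dog`` and its attributes are made up for this illustration.

```python
class Dog(object):
    tricks = []          # class variable: one list shared by all instances
    kind = "canine"      # class variable used as a default

    def __init__(self, name):
        self.name = name     # instance variable


rex = Dog("Rex")
fido = Dog("Fido")
rex.tricks.append("roll over")   # mutates the list shared through the class
rex.kind = "hound"               # creates an instance variable hiding Dog.kind
```

The append is visible through ``fido`` as well, whereas the assignment to ``rex.kind`` affects only ``rex``.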
Class definitions, like function definitions, may be wrapped by one or more
:term:`decorator` expressions. The evaluation rules for the decorator
expressions are the same as for functions. The result must be a class object,
which is then bound to the class name.
.. rubric:: Footnotes
.. [#] The exception is propagated to the invocation stack unless
there is a :keyword:`finally` clause which happens to raise another
exception. That new exception causes the old one to be lost.
.. [#] A string literal appearing as the first statement in the function body is
transformed into the function's ``__doc__`` attribute and therefore the
function's :term:`docstring`.
.. [#] A string literal appearing as the first statement in the class body is
transformed into the namespace's ``__doc__`` item and therefore the class's
:term:`docstring`.

2528
Doc/reference/datamodel.rst Normal file

File diff suppressed because it is too large

@@ -0,0 +1,247 @@
.. _execmodel:
***************
Execution model
***************
.. index:: single: execution model
.. _naming:
Naming and binding
==================
.. index::
pair: code; block
single: namespace
single: scope
.. index::
single: name
pair: binding; name
:dfn:`Names` refer to objects. Names are introduced by name binding operations.
Each occurrence of a name in the program text refers to the :dfn:`binding` of
that name established in the innermost function block containing the use.
.. index:: single: block
A :dfn:`block` is a piece of Python program text that is executed as a unit.
The following are blocks: a module, a function body, and a class definition.
Each command typed interactively is a block. A script file (a file given as
standard input to the interpreter or specified on the interpreter command line
as the first argument) is a code block. A script command (a command specified on
the interpreter command line with the '**-c**' option) is a code block. The
file read by the built-in function :func:`execfile` is a code block. The string
argument passed to the built-in function :func:`eval` and to the :keyword:`exec`
statement is a code block. The expression read and evaluated by the built-in
function :func:`input` is a code block.
.. index:: pair: execution; frame
A code block is executed in an :dfn:`execution frame`. A frame contains some
administrative information (used for debugging) and determines where and how
execution continues after the code block's execution has completed.
.. index:: single: scope
A :dfn:`scope` defines the visibility of a name within a block. If a local
variable is defined in a block, its scope includes that block. If the
definition occurs in a function block, the scope extends to any blocks contained
within the defining one, unless a contained block introduces a different binding
for the name. The scope of names defined in a class block is limited to the
class block; it does not extend to the code blocks of methods -- this includes
generator expressions since they are implemented using a function scope. This
means that the following will fail::
   class A:
       a = 42
       b = list(a + i for i in range(10))
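The failure, and one possible way around it, can be sketched as follows. The helper ``_offsets`` is a hypothetical function introduced for this example; passing the class-level value as an argument means it is evaluated in the class block itself rather than inside the generator expression's function scope.

```python
try:
    class A(object):
        a = 42
        b = list(a + i for i in range(3))   # 'a' is not visible in here
except NameError:
    failed = True
else:
    failed = False


def _offsets(base, count):
    """Hypothetical helper: build the list outside the class block."""
    return [base + i for i in range(count)]


class B(object):
    a = 42
    b = _offsets(a, 3)    # 'a' is evaluated in the class block itself
```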
.. index:: single: environment
When a name is used in a code block, it is resolved using the nearest enclosing
scope. The set of all such scopes visible to a code block is called the block's
:dfn:`environment`.
.. index:: pair: free; variable
If a name is bound in a block, it is a local variable of that block. If a name
is bound at the module level, it is a global variable. (The variables of the
module code block are local and global.) If a variable is used in a code block
but not defined there, it is a :dfn:`free variable`.
.. index::
single: NameError (built-in exception)
single: UnboundLocalError
When a name is not found at all, a :exc:`NameError` exception is raised. If the
name refers to a local variable that has not been bound, a
:exc:`UnboundLocalError` exception is raised. :exc:`UnboundLocalError` is a
subclass of :exc:`NameError`.
.. index:: statement: from
The following constructs bind names: formal parameters to functions,
:keyword:`import` statements, class and function definitions (these bind the
class or function name in the defining block), and targets that are identifiers
if occurring in an assignment, :keyword:`for` loop header, in the second
position of an :keyword:`except` clause header or after :keyword:`as` in a
:keyword:`with` statement. The :keyword:`import` statement
of the form ``from ... import *`` binds all names defined in the imported
module, except those beginning with an underscore. This form may only be used
at the module level.
A target occurring in a :keyword:`del` statement is also considered bound for
this purpose (though the actual semantics are to unbind the name). It is
illegal to unbind a name that is referenced by an enclosing scope; the compiler
will report a :exc:`SyntaxError`.
Each assignment or import statement occurs within a block defined by a class or
function definition or at the module level (the top-level code block).
If a name binding operation occurs anywhere within a code block, all uses of the
name within the block are treated as references to the current block. This can
lead to errors when a name is used within a block before it is bound. This rule
is subtle. Python lacks declarations and allows name binding operations to
occur anywhere within a code block. The local variables of a code block can be
determined by scanning the entire text of the block for name binding operations.
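A sketch of the error this rule can produce. The names ``counter``, ``read_only`` and ``read_then_bind`` are illustrative only.

```python
counter = 0


def read_only():
    return counter + 1      # no binding in this block: the global is used


def read_then_bind():
    value = counter + 1     # fails: 'counter' is local to the whole block
    counter = value         # this binding makes the name local everywhere
    return value


try:
    read_then_bind()
except UnboundLocalError as exc:
    error = exc             # keep a reference for inspection
```

The assignment on the second line of ``read_then_bind`` makes ``counter`` a local variable of the entire function, so the earlier read raises :exc:`UnboundLocalError`, which is a subclass of :exc:`NameError`.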
If the global statement occurs within a block, all uses of the name specified in
the statement refer to the binding of that name in the top-level namespace.
Names are resolved in the top-level namespace by searching the global namespace,
i.e. the namespace of the module containing the code block, and the builtins
namespace, the namespace of the module :mod:`__builtin__`. The global namespace
is searched first. If the name is not found there, the builtins namespace is
searched. The global statement must precede all uses of the name.
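The effect of the global statement can be sketched with two otherwise similar functions. The names ``total``, ``add_local`` and ``add_global`` are assumptions of this example.

```python
total = 0  # a module-level (global) name


def add_local(amount):
    total = amount          # binds a new local name; the global is untouched
    return total


def add_global(amount):
    global total            # placed before any use of 'total' in this block
    total = total + amount
    return total


add_local(100)
add_global(5)
add_global(7)
```

Only the calls to ``add_global`` change the module-level ``total``.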
.. index:: pair: restricted; execution
The builtins namespace associated with the execution of a code block is actually
found by looking up the name ``__builtins__`` in its global namespace; this
should be a dictionary or a module (in the latter case the module's dictionary
is used). By default, when in the :mod:`__main__` module, ``__builtins__`` is
the built-in module :mod:`__builtin__` (note: no 's'); when in any other module,
``__builtins__`` is an alias for the dictionary of the :mod:`__builtin__` module
itself. ``__builtins__`` can be set to a user-created dictionary to create a
weak form of restricted execution.
.. impl-detail::
Users should not touch ``__builtins__``; it is strictly an implementation
detail. Users wanting to override values in the builtins namespace should
:keyword:`import` the :mod:`__builtin__` (no 's') module and modify its
attributes appropriately.
.. index:: module: __main__
The namespace for a module is automatically created the first time a module is
imported. The main module for a script is always called :mod:`__main__`.
The :keyword:`global` statement has the same scope as a name binding operation
in the same block. If the nearest enclosing scope for a free variable contains
a global statement, the free variable is treated as a global.
A class definition is an executable statement that may use and define names.
These references follow the normal rules for name resolution. The namespace of
the class definition becomes the attribute dictionary of the class. Names
defined at the class scope are not visible in methods.
.. _dynamic-features:
Interaction with dynamic features
---------------------------------
There are several cases where Python statements are illegal when used in
conjunction with nested scopes that contain free variables.
If a variable is referenced in an enclosing scope, it is illegal to delete the
name. An error will be reported at compile time.
If the wild card form of import --- ``import *`` --- is used in a function and
the function contains or is a nested block with free variables, the compiler
will raise a :exc:`SyntaxError`.
If :keyword:`exec` is used in a function and the function contains or is a
nested block with free variables, the compiler will raise a :exc:`SyntaxError`
unless the exec explicitly specifies the local namespace for the
:keyword:`exec`. (In other words, ``exec obj`` would be illegal, but ``exec obj
in ns`` would be legal.)
The :func:`eval`, :func:`execfile`, and :func:`input` functions and the
:keyword:`exec` statement do not have access to the full environment for
resolving names. Names may be resolved in the local and global namespaces of
the caller. Free variables are not resolved in the nearest enclosing namespace,
but in the global namespace. [#]_ The :keyword:`exec` statement and the
:func:`eval` and :func:`execfile` functions have optional arguments to override
the global and local namespace. If only one namespace is specified, it is used
for both.
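The sketch below illustrates these rules using only :func:`eval`, since the :keyword:`exec` statement has a different syntax in later Python versions. The names ``outer``, ``inner`` and ``hidden`` are hypothetical, and the example assumes that no global variable named ``hidden`` exists.

```python
def outer():
    hidden = "enclosing"

    def inner():
        # The name is not referenced in inner's own code, so the evaluated
        # string cannot reach the enclosing function's variable.
        try:
            return eval("hidden")
        except NameError:
            return "NameError"

    return inner()


one_namespace = eval("x + 1", {"x": 41})              # used as globals and locals
two_namespaces = eval("x + y", {"x": 1}, {"y": 2})    # globals, then locals
```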
.. _exceptions:
Exceptions
==========
.. index:: single: exception
.. index::
single: raise an exception
single: handle an exception
single: exception handler
single: errors
single: error handling
Exceptions are a means of breaking out of the normal flow of control of a code
block in order to handle errors or other exceptional conditions. An exception
is *raised* at the point where the error is detected; it may be *handled* by the
surrounding code block or by any code block that directly or indirectly invoked
the code block where the error occurred.
The Python interpreter raises an exception when it detects a run-time error
(such as division by zero). A Python program can also explicitly raise an
exception with the :keyword:`raise` statement. Exception handlers are specified
with the :keyword:`try` ... :keyword:`except` statement. The :keyword:`finally`
clause of such a statement can be used to specify cleanup code which does not
handle the exception, but is executed whether an exception occurred or not in
the preceding code.
.. index:: single: termination model
Python uses the "termination" model of error handling: an exception handler can
find out what happened and continue execution at an outer level, but it cannot
repair the cause of the error and retry the failing operation (except by
re-entering the offending piece of code from the top).
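The termination model can be illustrated with a retry loop. In this sketch the helper ``parse_first_valid`` is hypothetical; the point is that the handler cannot resume inside the failed call, so the only way to try again is to enter the :keyword:`try` block from the top.

```python
def parse_first_valid(candidates):
    """Return the first candidate that converts to int, or None."""
    for text in candidates:
        try:
            return int(text)   # the operation that may fail
        except ValueError:
            # Execution continues here, at the outer level; the failed
            # int() call is gone and can only be re-entered from the top.
            continue
    return None


value = parse_first_valid(["x", "", "42"])
```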
.. index:: single: SystemExit (built-in exception)
When an exception is not handled at all, the interpreter terminates execution of
the program, or returns to its interactive main loop. In either case, it prints
a stack backtrace, except when the exception is :exc:`SystemExit`.
Exceptions are identified by class instances. The :keyword:`except` clause is
selected depending on the class of the instance: it must reference the class of
the instance or a base class thereof. The instance can be received by the
handler and can carry additional information about the exceptional condition.
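A minimal sketch of clause selection by class, assuming two hypothetical exception classes ``AppError`` and ``ConfigError``:

```python
class AppError(Exception):
    """Hypothetical base class for this sketch."""


class ConfigError(AppError):
    pass


def classify(exc):
    try:
        raise exc
    except AppError as caught:     # matches AppError and any subclass
        return "app", caught.args
    except Exception as caught:    # reached only for unrelated classes
        return "other", caught.args


first = classify(ConfigError("missing key"))
second = classify(KeyError("k"))
```

The handler receives the instance itself, so the extra information passed to the constructor is available through ``caught.args``.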
Exceptions can also be identified by strings, in which case the
:keyword:`except` clause is selected by object identity. An arbitrary value can
be raised along with the identifying string which can be passed to the handler.
.. note::
Exception messages are not part of the Python API. Their contents may
change from one version of Python to the next without warning and should not be
relied on by code which will run under multiple versions of the interpreter.
See also the description of the :keyword:`try` statement in section :ref:`try`
and the :keyword:`raise` statement in section :ref:`raise`.
.. rubric:: Footnotes
.. [#] This limitation occurs because the code that is executed by these operations is
not available at the time the module is compiled.

File diff suppressed because it is too large


@@ -0,0 +1,7 @@
Full Grammar specification
==========================
This is the full Python grammar, as it is read by the parser generator and used
to parse Python source files:
.. literalinclude:: ../../Grammar/Grammar

Doc/reference/index.rst Normal file

@@ -0,0 +1,28 @@
.. _reference-index:
#################################
The Python Language Reference
#################################
This reference manual describes the syntax and "core semantics" of the
language. It is terse, but attempts to be exact and complete. The semantics of
non-essential built-in object types and of the built-in functions and modules
are described in :ref:`library-index`. For an informal introduction to the
language, see :ref:`tutorial-index`. For C or C++ programmers, two additional
manuals exist: :ref:`extending-index` describes the high-level picture of how to
write a Python extension module, and the :ref:`c-api-index` describes the
interfaces available to C/C++ programmers in detail.
.. toctree::
:maxdepth: 2
:numbered:
introduction.rst
lexical_analysis.rst
datamodel.rst
executionmodel.rst
expressions.rst
simple_stmts.rst
compound_stmts.rst
toplevel_components.rst
grammar.rst


@@ -0,0 +1,137 @@
.. _introduction:
************
Introduction
************
This reference manual describes the Python programming language. It is not
intended as a tutorial.
While I am trying to be as precise as possible, I chose to use English rather
than formal specifications for everything except syntax and lexical analysis.
This should make the document more understandable to the average reader, but
will leave room for ambiguities. Consequently, if you were coming from Mars and
tried to re-implement Python from this document alone, you might have to guess
things and in fact you would probably end up implementing quite a different
language. On the other hand, if you are using Python and wonder what the precise
rules about a particular area of the language are, you should definitely be able
to find them here. If you would like to see a more formal definition of the
language, maybe you could volunteer your time --- or invent a cloning machine
:-).
It is dangerous to add too many implementation details to a language reference
document --- the implementation may change, and other implementations of the
same language may work differently. On the other hand, there is currently only
one Python implementation in widespread use (although alternate implementations
exist), and its particular quirks are sometimes worth mentioning,
especially where the implementation imposes additional limitations. Therefore,
you'll find short "implementation notes" sprinkled throughout the text.
Every Python implementation comes with a number of built-in and standard
modules. These are documented in :ref:`library-index`. A few built-in modules
are mentioned when they interact in a significant way with the language
definition.
.. _implementations:
Alternate Implementations
=========================
Though there is one Python implementation which is by far the most popular,
there are some alternate implementations which are of particular interest to
different audiences.
Known implementations include:
CPython
This is the original and most-maintained implementation of Python, written in C.
New language features generally appear here first.
Jython
Python implemented in Java. This implementation can be used as a scripting
language for Java applications, or can be used to create applications using the
Java class libraries. It is also often used to create tests for Java libraries.
More information can be found at `the Jython website <http://www.jython.org/>`_.
Python for .NET
This implementation actually uses the CPython implementation, but is a managed
.NET application and makes .NET libraries available. It was created by Brian
Lloyd. For more information, see the `Python for .NET home page
<https://pythonnet.github.io/>`_.
IronPython
An alternate Python for .NET. Unlike Python.NET, this is a complete Python
implementation that generates IL, and compiles Python code directly to .NET
assemblies. It was created by Jim Hugunin, the original creator of Jython. For
more information, see `the IronPython website <http://ironpython.net/>`_.
PyPy
An implementation of Python written completely in Python. It supports several
advanced features not found in other implementations like stackless support
and a Just in Time compiler. One of the goals of the project is to encourage
experimentation with the language itself by making it easier to modify the
interpreter (since it is written in Python). Additional information is
available on `the PyPy project's home page <http://pypy.org/>`_.
Each of these implementations varies in some way from the language as documented
in this manual, or introduces specific information beyond what's covered in the
standard Python documentation. Please refer to the implementation-specific
documentation to determine what else you need to know about the specific
implementation you're using.
.. _notation:
Notation
========
.. index::
single: BNF
single: grammar
single: syntax
single: notation
The descriptions of lexical analysis and syntax use a modified BNF grammar
notation. This uses the following style of definition:
.. productionlist:: *
name: `lc_letter` (`lc_letter` | "_")*
lc_letter: "a"..."z"
The first line says that a ``name`` is an ``lc_letter`` followed by a sequence
of zero or more ``lc_letter``\ s and underscores. An ``lc_letter`` in turn is
any of the single characters ``'a'`` through ``'z'``. (This rule is actually
adhered to for the names defined in lexical and grammar rules in this document.)
Each rule begins with a name (which is the name defined by the rule) and
``::=``. A vertical bar (``|``) is used to separate alternatives; it is the
least binding operator in this notation. A star (``*``) means zero or more
repetitions of the preceding item; likewise, a plus (``+``) means one or more
repetitions, and a phrase enclosed in square brackets (``[ ]``) means zero or
one occurrences (in other words, the enclosed phrase is optional). The ``*``
and ``+`` operators bind as tightly as possible; parentheses are used for
grouping. Literal strings are enclosed in quotes. White space is only
meaningful to separate tokens. Rules are normally contained on a single line;
rules with many alternatives may be formatted alternatively with each line after
the first beginning with a vertical bar.
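For readers more used to regular expressions, the ``name`` rule above can be approximated as follows. This is only an informal translation; the function ``is_name`` is an invented helper.

```python
import re

# name: lc_letter (lc_letter | "_")*
# lc_letter: "a"..."z"
NAME = re.compile(r"[a-z][a-z_]*\Z")


def is_name(text):
    """Return True if text matches the ``name`` production."""
    return NAME.match(text) is not None
```

Under this translation ``lc_letter`` is a name, while ``_x``, ``Name`` and the empty string are not.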
.. index::
single: lexical definitions
single: ASCII@ASCII
In lexical definitions (such as the example above), two more conventions are used:
Two literal characters separated by three dots mean a choice of any single
character in the given (inclusive) range of ASCII characters. A phrase between
angle brackets (``<...>``) gives an informal description of the symbol
defined; e.g., this could be used to describe the notion of 'control character'
if needed.
Even though the notation used is almost the same, there is a big difference
between the meaning of lexical and syntactic definitions: a lexical definition
operates on the individual characters of the input source, while a syntax
definition operates on the stream of tokens generated by the lexical analysis.
All uses of BNF in the next chapter ("Lexical Analysis") are lexical
definitions; uses in subsequent chapters are syntactic definitions.


@@ -0,0 +1,776 @@
.. _lexical:
****************
Lexical analysis
****************
.. index::
single: lexical analysis
single: parser
single: token
A Python program is read by a *parser*. Input to the parser is a stream of
*tokens*, generated by the *lexical analyzer*. This chapter describes how the
lexical analyzer breaks a file into tokens.
Python uses the 7-bit ASCII character set for program text.
.. versionadded:: 2.3
An encoding declaration can be used to indicate that string literals and
comments use an encoding different from ASCII.
For compatibility with older versions, Python only warns if it finds 8-bit
characters; those warnings should be corrected by either declaring an explicit
encoding, or using escape sequences if those bytes are binary data, instead of
characters.
The run-time character set depends on the I/O devices connected to the program
but is generally a superset of ASCII.
**Future compatibility note:** It may be tempting to assume that the character
set for 8-bit characters is ISO Latin-1 (an ASCII superset that covers most
western languages that use the Latin alphabet), but it is possible that in the
future Unicode text editors will become common. These generally use the UTF-8
encoding, which is also an ASCII superset, but with very different use for the
characters with ordinals 128-255. While there is no consensus on this subject
yet, it is unwise to assume either Latin-1 or UTF-8, even though the current
implementation appears to favor Latin-1. This applies both to the source
character set and the run-time character set.
.. _line-structure:
Line structure
==============
.. index:: single: line structure
A Python program is divided into a number of *logical lines*.
.. _logical:
Logical lines
-------------
.. index::
single: logical line
single: physical line
single: line joining
single: NEWLINE token
The end of a logical line is represented by the token NEWLINE. Statements
cannot cross logical line boundaries except where NEWLINE is allowed by the
syntax (e.g., between statements in compound statements). A logical line is
constructed from one or more *physical lines* by following the explicit or
implicit *line joining* rules.
.. _physical:
Physical lines
--------------
A physical line is a sequence of characters terminated by an end-of-line
sequence. In source files and strings, any of the standard platform line
termination sequences can be used - the Unix form using ASCII LF (linefeed),
the Windows form using the ASCII sequence CR LF (return followed by linefeed),
or the old Macintosh form using the ASCII CR (return) character. All of these
forms can be used equally, regardless of platform. The end of input also serves
as an implicit terminator for the final physical line.
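As a rough illustration, :meth:`str.splitlines` treats the same three terminators as line ends (it may recognize further separators as well, so it is not an exact model of the lexical analyzer):

```python
# LF, CR LF and CR each end a physical line; the end of input
# terminates the last one.
source = "a = 1\nb = 2\r\nc = 3\rd = 4"
physical_lines = source.splitlines()
```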
When embedding Python, source code strings should be passed to Python APIs using
the standard C conventions for newline characters (the ``\n`` character,
representing ASCII LF, is the line terminator).
.. _comments:
Comments
--------
.. index::
single: comment
single: hash character
A comment starts with a hash character (``#``) that is not part of a string
literal, and ends at the end of the physical line. A comment signifies the end
of the logical line unless the implicit line joining rules are invoked. Comments
are ignored by the syntax; they are not tokens.
.. _encodings:
Encoding declarations
---------------------
.. index:: source character set, encoding declarations (source file)
If a comment in the first or second line of the Python script matches the
regular expression ``coding[=:]\s*([-\w.]+)``, this comment is processed as an
encoding declaration; the first group of this expression names the encoding of
the source code file. The encoding declaration must appear on a line of its
own. If it is the second line, the first line must also be a comment-only line.
The recommended forms of an encoding expression are ::
# -*- coding: <encoding-name> -*-
which is recognized also by GNU Emacs, and ::
# vim:fileencoding=<encoding-name>
which is recognized by Bram Moolenaar's VIM. In addition, if the first bytes of
the file are the UTF-8 byte-order mark (``'\xef\xbb\xbf'``), the declared file
encoding is UTF-8 (this is supported, among others, by Microsoft's
:program:`notepad`).
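The regular expression given above can be tried directly. The helper ``declared_encoding`` below is an invented name, and the sketch checks a single comment line only; it does not implement the first-or-second-line rule or the byte-order mark.

```python
import re

CODING = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(comment_line):
    """Return the encoding named in a comment line, or None."""
    match = CODING.search(comment_line)
    return match.group(1) if match else None


emacs_style = declared_encoding("# -*- coding: utf-8 -*-")
vim_style = declared_encoding("# vim:fileencoding=latin-1")
plain = declared_encoding("# just a comment")
```

Both recommended forms match because the expression only looks for the text ``coding`` followed by ``=`` or ``:``.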
If an encoding is declared, the encoding name must be recognized by Python. The
encoding is used for all lexical analysis, in particular to find the end of a
string, and to interpret the contents of Unicode literals. String literals are
converted to Unicode for syntactical analysis, then converted back to their
original encoding before interpretation starts.
.. XXX there should be a list of supported encodings.
.. _explicit-joining:
Explicit line joining
---------------------
.. index::
single: physical line
single: line joining
single: line continuation
single: backslash character
Two or more physical lines may be joined into a single logical line using
backslash characters (``\``), as follows: when a physical line ends in a
backslash that is not part of a string literal or comment, it is joined with the
following line to form a single logical line, deleting the backslash and the
following end-of-line character. For example::

   if 1900 < year < 2100 and 1 <= month <= 12 \
      and 1 <= day <= 31 and 0 <= hour < 24 \
      and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date
           return 1

A line ending in a backslash cannot carry a comment. A backslash does not
continue a comment. A backslash does not continue a token except for string
literals (i.e., tokens other than string literals cannot be split across
physical lines using a backslash). A backslash is illegal elsewhere on a line
outside a string literal.
.. _implicit-joining:
Implicit line joining
---------------------
Expressions in parentheses, square brackets or curly braces can be split over
more than one physical line without using backslashes. For example::

   month_names = ['Januari', 'Februari', 'Maart',      # These are the
                  'April',   'Mei',      'Juni',       # Dutch names
                  'Juli',    'Augustus', 'September',  # for the months
                  'Oktober', 'November', 'December']   # of the year

Implicitly continued lines can carry comments. The indentation of the
continuation lines is not important. Blank continuation lines are allowed.
There is no NEWLINE token between implicit continuation lines. Implicitly
continued lines can also occur within triple-quoted strings (see below); in that
case they cannot carry comments.
.. _blank-lines:
Blank lines
-----------
.. index:: single: blank line
A logical line that contains only spaces, tabs, formfeeds and possibly a
comment, is ignored (i.e., no NEWLINE token is generated). During interactive
input of statements, handling of a blank line may differ depending on the
implementation of the read-eval-print loop. In the standard implementation, an
entirely blank logical line (i.e. one containing not even whitespace or a
comment) terminates a multi-line statement.
.. _indentation:
Indentation
-----------
.. index::
single: indentation
single: whitespace
single: leading whitespace
single: space
single: tab
single: grouping
single: statement grouping
Leading whitespace (spaces and tabs) at the beginning of a logical line is used
to compute the indentation level of the line, which in turn is used to determine
the grouping of statements.
First, tabs are replaced (from left to right) by one to eight spaces such that
the total number of characters up to and including the replacement is a multiple
of eight (this is intended to be the same rule as used by Unix). The total
number of spaces preceding the first non-blank character then determines the
line's indentation. Indentation cannot be split over multiple physical lines
using backslashes; the whitespace up to the first backslash determines the
indentation.
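The tab rule can be sketched as a short function. The name ``indentation_width`` and the ``tabsize`` parameter are assumptions made for this example; formfeeds and backslash continuations are not handled.

```python
def indentation_width(line, tabsize=8):
    """Return the column of the first non-blank character of line."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # Advance to the next multiple of tabsize (one to eight spaces).
            column = (column // tabsize + 1) * tabsize
        else:
            break
    return column
```

For instance, two spaces followed by a tab give the same indentation (8) as a single tab, which is one reason why mixing the two is discouraged.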
**Cross-platform compatibility note:** because of the nature of text editors on
non-UNIX platforms, it is unwise to use a mixture of spaces and tabs for the
indentation in a single source file. It should also be noted that different
platforms may explicitly limit the maximum indentation level.
A formfeed character may be present at the start of the line; it will be ignored
for the indentation calculations above. Formfeed characters occurring elsewhere
in the leading whitespace have an undefined effect (for instance, they may reset
the space count to zero).
.. index::
single: INDENT token
single: DEDENT token
The indentation levels of consecutive lines are used to generate INDENT and
DEDENT tokens, using a stack, as follows.
Before the first line of the file is read, a single zero is pushed on the stack;
this will never be popped off again. The numbers pushed on the stack will
always be strictly increasing from bottom to top. At the beginning of each
logical line, the line's indentation level is compared to the top of the stack.
If it is equal, nothing happens. If it is larger, it is pushed on the stack, and
one INDENT token is generated. If it is smaller, it *must* be one of the
numbers occurring on the stack; all numbers on the stack that are larger are
popped off, and for each number popped off a DEDENT token is generated. At the
end of the file, a DEDENT token is generated for each number remaining on the
stack that is larger than zero.
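The stack algorithm just described can be sketched as follows. The function ``indent_tokens`` is a hypothetical helper that takes the indentation level of each logical line and emits a placeholder ``"LINE"`` entry for the line's own tokens; it is not how the real tokenizer is written.

```python
def indent_tokens(levels):
    """Turn a list of per-line indentation levels into a token list."""
    stack = [0]                      # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            if level not in stack:
                raise IndentationError("inconsistent dedent")
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
        tokens.append("LINE")        # stands in for the line's own tokens
    while stack[-1] > 0:             # end of file: close all open levels
        stack.pop()
        tokens.append("DEDENT")
    return tokens


nested = indent_tokens([0, 4, 8, 4, 0])
unterminated = indent_tokens([0, 4, 8])
```

A sequence such as ``[0, 4, 8, 2]`` raises :exc:`IndentationError`, because 2 is not one of the levels on the stack.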
Here is an example of a correctly (though confusingly) indented piece of Python
code::

   def perm(l):
           # Compute the list of all permutations of l
       if len(l) <= 1:
                     return [l]
       r = []
       for i in range(len(l)):
                s = l[:i] + l[i+1:]
                p = perm(s)
                for x in p:
                 r.append(l[i:i+1] + x)
       return r

The following example shows various indentation errors::

    def perm(l):                       # error: first line indented
   for i in range(len(l)):             # error: not indented
       s = l[:i] + l[i+1:]
           p = perm(l[:i] + l[i+1:])   # error: unexpected indent
           for x in p:
                   r.append(l[i:i+1] + x)
               return r                # error: inconsistent dedent

(Actually, the first three errors are detected by the parser; only the last
error is found by the lexical analyzer --- the indentation of ``return r`` does
not match a level popped off the stack.)
.. _whitespace:
Whitespace between tokens
-------------------------
Except at the beginning of a logical line or in string literals, the whitespace
characters space, tab and formfeed can be used interchangeably to separate
tokens. Whitespace is needed between two tokens only if their concatenation
could otherwise be interpreted as a different token (e.g., ab is one token, but
a b is two tokens).
.. _other-tokens:
Other tokens
============
Besides NEWLINE, INDENT and DEDENT, the following categories of tokens exist:
*identifiers*, *keywords*, *literals*, *operators*, and *delimiters*. Whitespace
characters (other than line terminators, discussed earlier) are not tokens, but
serve to delimit tokens. Where ambiguity exists, a token comprises the longest
possible string that forms a legal token, when read from left to right.
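The longest-match rule can be observed with the standard :mod:`tokenize` module. This is a sketch only; ``significant_tokens`` is an invented helper that keeps just names, numbers and operators.

```python
import io
import tokenize


def significant_tokens(source):
    """Return the strings of the NAME, NUMBER and OP tokens in source."""
    readline = io.StringIO(source).readline
    wanted = (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
    return [tok[1] for tok in tokenize.generate_tokens(readline)
            if tok[0] in wanted]


# "<<=" is read as one operator, not as "<", "<", "=" or "<<", "=".
tokens = significant_tokens(u"a<<=b\n")
```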
.. _identifiers:
Identifiers and keywords
========================
.. index::
single: identifier
single: name
Identifiers (also referred to as *names*) are described by the following lexical
definitions:
.. productionlist::
identifier: (`letter`|"_") (`letter` | `digit` | "_")*
letter: `lowercase` | `uppercase`
lowercase: "a"..."z"
uppercase: "A"..."Z"
digit: "0"..."9"
Identifiers are unlimited in length. Case is significant.
.. _keywords:
Keywords
--------
.. index::
single: keyword
single: reserved word
The following identifiers are used as reserved words, or *keywords* of the
language, and cannot be used as ordinary identifiers. They must be spelled
exactly as written here:
.. sourcecode:: text
and del from not while
as elif global or with
assert else if pass yield
break except import print
class exec in raise
continue finally is return
def for lambda try
.. versionchanged:: 2.4
:const:`None` became a constant and is now recognized by the compiler as a name
for the built-in object :const:`None`. Although it is not a keyword, you cannot
assign a different object to it.
.. versionchanged:: 2.5
Using :keyword:`as` and :keyword:`with` as identifiers triggers a warning. To
use them as keywords, enable the ``with_statement`` future feature.
.. versionchanged:: 2.6
:keyword:`as` and :keyword:`with` are full keywords.
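Whether a given word is reserved in the running interpreter can be checked with the standard :mod:`keyword` module. The exact list differs between Python versions, so this sketch only uses words that are keywords in all of them:

```python
import keyword

candidates = ("while", "lambda", "perm")
reserved = [word for word in candidates if keyword.iskeyword(word)]
```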
.. _id-classes:
Reserved classes of identifiers
-------------------------------
Certain classes of identifiers (besides keywords) have special meanings. These
classes are identified by the patterns of leading and trailing underscore
characters:
``_*``
Not imported by ``from module import *``. The special identifier ``_`` is used
in the interactive interpreter to store the result of the last evaluation; it is
stored in the :mod:`__builtin__` module. When not in interactive mode, ``_``
has no special meaning and is not defined. See section :ref:`import`.
.. note::
The name ``_`` is often used in conjunction with internationalization;
refer to the documentation for the :mod:`gettext` module for more
information on this convention.
``__*__``
System-defined names. These names are defined by the interpreter and its
implementation (including the standard library). Current system names are
discussed in the :ref:`specialnames` section and elsewhere. More will likely
be defined in future versions of Python. *Any* use of ``__*__`` names, in
any context, that does not follow explicitly documented use, is subject to
breakage without warning.
``__*``
Class-private names. Names in this category, when used within the context of a
class definition, are re-written to use a mangled form to help avoid name
clashes between "private" attributes of base and derived classes. See section
:ref:`atom-identifiers`.
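A short sketch of class-private name mangling, using two hypothetical classes ``Base`` and ``Derived``:

```python
class Base(object):
    def __init__(self):
        self.__token = "base"        # stored as _Base__token


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = "derived"     # stored as _Derived__token; no clash


obj = Derived()
mangled = sorted(name for name in vars(obj) if name.endswith("__token"))
```

Both attributes coexist on the instance because each is rewritten with the name of the class in which it textually appears.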
.. _literals:
Literals
========
.. index::
single: literal
single: constant
Literals are notations for constant values of some built-in types.
.. _strings:
String literals
---------------
.. index:: single: string literal
String literals are described by the following lexical definitions:
.. index:: single: ASCII@ASCII
.. productionlist::
stringliteral: [`stringprefix`](`shortstring` | `longstring`)
stringprefix: "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"
: | "b" | "B" | "br" | "Br" | "bR" | "BR"
shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"'
longstring: "'''" `longstringitem`* "'''"
: | '"""' `longstringitem`* '"""'
shortstringitem: `shortstringchar` | `escapeseq`
longstringitem: `longstringchar` | `escapeseq`
shortstringchar: <any source character except "\" or newline or the quote>
longstringchar: <any source character except "\">
escapeseq: "\" <any ASCII character>
One syntactic restriction not indicated by these productions is that whitespace
is not allowed between the :token:`stringprefix` and the rest of the string
literal. The source character set is defined by the encoding declaration; it is
ASCII if no encoding declaration is given in the source file; see section
:ref:`encodings`.
.. index::
single: triple-quoted string
single: Unicode Consortium
single: string; Unicode
single: raw string
In plain English: String literals can be enclosed in matching single quotes
(``'``) or double quotes (``"``). They can also be enclosed in matching groups
of three single or double quotes (these are generally referred to as
*triple-quoted strings*). The backslash (``\``) character is used to escape
characters that otherwise have a special meaning, such as newline, backslash
itself, or the quote character. String literals may optionally be prefixed with
a letter ``'r'`` or ``'R'``; such strings are called :dfn:`raw strings` and use
different rules for interpreting backslash escape sequences. A prefix of
``'u'`` or ``'U'`` makes the string a Unicode string. Unicode strings use the
Unicode character set as defined by the Unicode Consortium and ISO 10646. Some
additional escape sequences, described below, are available in Unicode strings.
A prefix of ``'b'`` or ``'B'`` is ignored in Python 2; it indicates that the
literal should become a bytes literal in Python 3 (e.g. when code is
automatically converted with 2to3). A ``'u'`` or ``'b'`` prefix may be followed
by an ``'r'`` prefix.
In triple-quoted strings, unescaped newlines and quotes are allowed (and are
retained), except that three unescaped quotes in a row terminate the string. (A
"quote" is the character used to open the string, i.e. either ``'`` or ``"``.)
.. index::
single: physical line
single: escape sequence
single: Standard C
single: C
Unless an ``'r'`` or ``'R'`` prefix is present, escape sequences in strings are
interpreted according to rules similar to those used by Standard C. The
recognized escape sequences are:
+-----------------+---------------------------------+-------+
| Escape Sequence | Meaning | Notes |
+=================+=================================+=======+
| ``\newline`` | Ignored | |
+-----------------+---------------------------------+-------+
| ``\\`` | Backslash (``\``) | |
+-----------------+---------------------------------+-------+
| ``\'`` | Single quote (``'``) | |
+-----------------+---------------------------------+-------+
| ``\"`` | Double quote (``"``) | |
+-----------------+---------------------------------+-------+
| ``\a`` | ASCII Bell (BEL) | |
+-----------------+---------------------------------+-------+
| ``\b`` | ASCII Backspace (BS) | |
+-----------------+---------------------------------+-------+
| ``\f`` | ASCII Formfeed (FF) | |
+-----------------+---------------------------------+-------+
| ``\n`` | ASCII Linefeed (LF) | |
+-----------------+---------------------------------+-------+
| ``\N{name}`` | Character named *name* in the | |
| | Unicode database (Unicode only) | |
+-----------------+---------------------------------+-------+
| ``\r`` | ASCII Carriage Return (CR) | |
+-----------------+---------------------------------+-------+
| ``\t`` | ASCII Horizontal Tab (TAB) | |
+-----------------+---------------------------------+-------+
| ``\uxxxx`` | Character with 16-bit hex value | \(1) |
| | *xxxx* (Unicode only) | |
+-----------------+---------------------------------+-------+
| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(2) |
| | *xxxxxxxx* (Unicode only) | |
+-----------------+---------------------------------+-------+
| ``\v`` | ASCII Vertical Tab (VT) | |
+-----------------+---------------------------------+-------+
| ``\ooo`` | Character with octal value | (3,5) |
| | *ooo* | |
+-----------------+---------------------------------+-------+
| ``\xhh`` | Character with hex value *hh* | (4,5) |
+-----------------+---------------------------------+-------+
.. index:: single: ASCII@ASCII
Notes:
(1)
Individual code units which form parts of a surrogate pair can be encoded using
this escape sequence.
(2)
Any Unicode character can be encoded this way, but characters outside the Basic
Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is
compiled to use 16-bit code units (the default).
(3)
As in Standard C, up to three octal digits are accepted.
(4)
Unlike in Standard C, exactly two hex digits are required.
(5)
In a string literal, hexadecimal and octal escapes denote the byte with the
given value; it is not necessary that the byte encodes a character in the source
character set. In a Unicode literal, these escapes denote a Unicode character
with the given value.
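A few of the recognized escapes, checked in a way that behaves the same for plain string literals on Python 2.7 and later (the variable names are illustrative):

```python
octal_a = "\101"           # up to three octal digits; 65 is "A"
hex_a = "\x41"             # exactly two hex digits; also "A"
newline_len = len("\n")    # a recognized escape is a single character
joined = "ab\
cd"                        # backslash-newline is ignored
```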
.. index:: single: unrecognized escape sequence
Unlike Standard C, all unrecognized escape sequences are left in the string
unchanged, i.e., *the backslash is left in the string*. (This behavior is
useful when debugging: if an escape sequence is mistyped, the resulting output
is more easily recognized as broken.) It is also important to note that the
escape sequences marked as "(Unicode only)" in the table above fall into the
category of unrecognized escapes for non-Unicode string literals.
When an ``'r'`` or ``'R'`` prefix is present, a character following a backslash
is included in the string without change, and *all backslashes are left in the
string*. For example, the string literal ``r"\n"`` consists of two characters:
a backslash and a lowercase ``'n'``. String quotes can be escaped with a
backslash, but the backslash remains in the string; for example, ``r"\""`` is a
valid string literal consisting of two characters: a backslash and a double
quote; ``r"\"`` is not a valid string literal (even a raw string cannot end in
an odd number of backslashes). Specifically, *a raw string cannot end in a
single backslash* (since the backslash would escape the following quote
character). Note also that a single backslash followed by a newline is
interpreted as those two characters as part of the string, *not* as a line
continuation.
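The raw-string rules can be verified by counting characters. In this sketch the last assignment shows one common workaround for the trailing-backslash restriction, relying on adjacent literal concatenation; the path is made up.

```python
raw_newline = r"\n"            # two characters: backslash and "n"
raw_quote = r"\""              # two characters: backslash and double quote
cooked = "\n"                  # one character: linefeed

# A raw string cannot end in a single backslash, so append it from a
# non-raw literal instead.
path = r"C:\temp" "\\"
```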
When an ``'r'`` or ``'R'`` prefix is used in conjunction with a ``'u'`` or
``'U'`` prefix, then the ``\uXXXX`` and ``\UXXXXXXXX`` escape sequences are
processed while *all other backslashes are left in the string*. For example,
the string literal ``ur"\u0062\n"`` consists of three Unicode characters: 'LATIN
SMALL LETTER B', 'REVERSE SOLIDUS', and 'LATIN SMALL LETTER N'. Backslashes can
be escaped with a preceding backslash; however, both remain in the string. As a
result, ``\uXXXX`` escape sequences are only recognized when there are an odd
number of backslashes.
.. _string-catenation:
String literal concatenation
----------------------------
Multiple adjacent string literals (delimited by whitespace), possibly using
different quoting conventions, are allowed, and their meaning is the same as
their concatenation. Thus, ``"hello" 'world'`` is equivalent to
``"helloworld"``. This feature can be used to reduce the number of backslashes
needed, to split long strings conveniently across long lines, or even to add
comments to parts of strings, for example::
re.compile("[A-Za-z_]" # letter or underscore
"[A-Za-z0-9_]*" # letter, digit or underscore
)
Note that this feature is defined at the syntactical level, but implemented at
compile time. The '+' operator must be used to concatenate string expressions
at run time. Also note that literal concatenation can use different quoting
styles for each component (even mixing raw strings and triple quoted strings).
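A runnable variant of the example above, as a sketch; the trailing raw ``\Z`` piece and the variable names are additions made for the illustration.

```python
import re

greeting = "hello" 'world'                  # joined at compile time

identifier = re.compile("[A-Za-z_]"         # letter or underscore
                        "[A-Za-z0-9_]*"     # letter, digit or underscore
                        r"\Z")              # raw literal mixed in (added here)

prefix = "hello"
runtime = prefix + "world"                  # expressions need the + operator
```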
.. _numbers:
Numeric literals
----------------
.. index::
single: number
single: numeric literal
single: integer literal
single: plain integer literal
single: long integer literal
single: floating point literal
single: hexadecimal literal
single: binary literal
single: octal literal
single: decimal literal
single: imaginary literal
single: complex; literal
There are four types of numeric literals: plain integers, long integers,
floating point numbers, and imaginary numbers. There are no complex literals
(complex numbers can be formed by adding a real number and an imaginary number).
Note that numeric literals do not include a sign; a phrase like ``-1`` is
actually an expression composed of the unary operator '``-``' and the literal
``1``.
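Two consequences of these rules, shown as a small sketch with illustrative variable names:

```python
power = -2 ** 2       # parsed as -(2 ** 2), since "-" is an operator
grouped = (-2) ** 2   # parentheses apply the minus first
z = 3 + 4j            # complex value: real literal plus imaginary literal
```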
.. _integers:
Integer and long integer literals
---------------------------------
Integer and long integer literals are described by the following lexical
definitions:
.. productionlist::
longinteger: `integer` ("l" | "L")
integer: `decimalinteger` | `octinteger` | `hexinteger` | `bininteger`
decimalinteger: `nonzerodigit` `digit`* | "0"
octinteger: "0" ("o" | "O") `octdigit`+ | "0" `octdigit`+
hexinteger: "0" ("x" | "X") `hexdigit`+
bininteger: "0" ("b" | "B") `bindigit`+
nonzerodigit: "1"..."9"
octdigit: "0"..."7"
bindigit: "0" | "1"
hexdigit: `digit` | "a"..."f" | "A"..."F"
Although both lower case ``'l'`` and upper case ``'L'`` are allowed as a suffix
for long integers, it is strongly recommended to always use ``'L'``, since the
letter ``'l'`` looks too much like the digit ``'1'``.
Plain integer literals that are above the largest representable plain integer
(e.g., 2147483647 when using 32-bit arithmetic) are accepted as if they were
long integers instead. [#]_ There is no limit for long integer literals apart
from what can be stored in available memory.
Some examples of plain integer literals (first row) and long integer literals
(second and third rows)::
7 2147483647 0177
3L 79228162514264337593543950336L 0377L 0x100000000L
79228162514264337593543950336 0xdeadbeef
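The values denoted by some of these literals can be checked with a short
sketch. It deliberately uses only the prefixed spellings, which should be
accepted by Python 2.6 and later as well as by Python 3; the legacy forms
``0177`` and ``3L`` are specific to Python 2 and appear only in comments. All
variable names are hypothetical.

```python
# Prefixed forms: octal, hexadecimal and binary.
octal = 0o177             # same value as the Python 2-only spelling 0177
hexadecimal = 0xdeadbeef
binary = 0b1010

# The digits of the legacy octal spelling, parsed explicitly in base 8.
legacy_octal = int("0177", 8)

# A literal too large for a 32-bit plain integer is still accepted.
big = 79228162514264337593543950336
one_past_32bit_max = 2147483647 + 1
```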
.. _floating:
Floating point literals
-----------------------
Floating point literals are described by the following lexical definitions:
.. productionlist::
floatnumber: `pointfloat` | `exponentfloat`
pointfloat: [`intpart`] `fraction` | `intpart` "."
exponentfloat: (`intpart` | `pointfloat`) `exponent`
intpart: `digit`+
fraction: "." `digit`+
exponent: ("e" | "E") ["+" | "-"] `digit`+
Note that the integer and exponent parts of floating point numbers can look like
octal integers, but are interpreted using radix 10. For example, ``077e010`` is
legal, and denotes the same number as ``77e10``. The allowed range of floating
point literals is implementation-dependent. Some examples of floating point
literals::
3.14 10. .001 1e100 3.14e-10 0e0
Note that numeric literals do not include a sign; a phrase like ``-1`` is
actually an expression composed of the unary operator ``-`` and the literal
``1``.
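The radix-10 rule for the integer and exponent parts of a floating point
literal can be illustrated as follows. This is a minimal sketch; the variable
names are hypothetical.

```python
# Leading zeros do not switch the literal to octal: both denote 77 * 10**10.
with_zeros = 077e010
without_zeros = 77e10

# Either the integer part or the fraction may be omitted, but not both.
trailing_point = 10.
leading_point = .001
```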
.. _imaginary:
Imaginary literals
------------------
Imaginary literals are described by the following lexical definitions:
.. productionlist::
imagnumber: (`floatnumber` | `intpart`) ("j" | "J")
An imaginary literal yields a complex number with a real part of 0.0. Complex
numbers are represented as a pair of floating point numbers and have the same
restrictions on their range. To create a complex number with a nonzero real
part, add a floating point number to it, e.g., ``(3+4j)``. Some examples of
imaginary literals::
3.14j 10.j 10j .001j 1e100j 3.14e-10j
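A brief sketch of how an imaginary literal combines with a real number
(variable names are hypothetical):

```python
# An imaginary literal on its own has a real part of 0.0.
pure = 4j

# Adding a real number yields a complex number with a nonzero real part.
combined = 3 + 4j
magnitude = abs(combined)
```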
.. _operators:
Operators
=========
.. index:: single: operators
The following tokens are operators:
.. code-block:: none
+ - * ** / // %
<< >> & | ^ ~
< > <= >= == != <>
The comparison operators ``<>`` and ``!=`` are alternate spellings of the same
operator. ``!=`` is the preferred spelling; ``<>`` is obsolescent.
.. _delimiters:
Delimiters
==========
.. index:: single: delimiters
The following tokens serve as delimiters in the grammar:
.. code-block:: none
( ) [ ] { } @
, : . ` = ;
+= -= *= /= //= %=
&= |= ^= >>= <<= **=
The period can also occur in floating-point and imaginary literals. A sequence
of three periods has a special meaning as an ellipsis in slices. The second half
of the list, the augmented assignment operators, serve lexically as delimiters,
but also perform an operation.
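For instance, an augmented assignment operator both separates the target from
the expression and applies an operation to it. A minimal sketch, with a
hypothetical variable name:

```python
total = 5
total += 3    # same effect here as: total = total + 3
total <<= 1   # same effect here as: total = total << 1
```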
The following printing ASCII characters have special meaning as part of other
tokens or are otherwise significant to the lexical analyzer:
.. code-block:: none
' " # \
.. index:: single: ASCII@ASCII
The following printing ASCII characters are not used in Python. Their
occurrence outside string literals and comments is an unconditional error:
.. code-block:: none
$ ?
.. rubric:: Footnotes
.. [#] In versions of Python prior to 2.4, octal and hexadecimal literals in the range
just above the largest representable plain integer but not exceeding the largest
unsigned 32-bit number (on a machine using 32-bit arithmetic), 4294967295, were
taken as the negative plain integer obtained by subtracting 4294967296 from
their unsigned value.


@@ -0,0 +1,125 @@
.. _top-level:
********************
Top-level components
********************
.. index:: single: interpreter
The Python interpreter can get its input from a number of sources: from a script
passed to it as standard input or as program argument, typed in interactively,
from a module source file, etc. This chapter gives the syntax used in these
cases.
.. _programs:
Complete Python programs
========================
.. index:: single: program
.. index::
module: sys
module: __main__
module: __builtin__
While a language specification need not prescribe how the language interpreter
is invoked, it is useful to have a notion of a complete Python program. A
complete Python program is executed in a minimally initialized environment: all
built-in and standard modules are available, but none have been initialized,
except for :mod:`sys` (various system services), :mod:`__builtin__` (built-in
functions, exceptions and ``None``) and :mod:`__main__`. The last of these is used to
provide the local and global namespace for execution of the complete program.
The syntax for a complete Python program is that for file input, described in
the next section.
.. index::
single: interactive mode
module: __main__
The interpreter may also be invoked in interactive mode; in this case, it does
not read and execute a complete program but reads and executes one statement
(possibly compound) at a time. The initial environment is identical to that of
a complete program; each statement is executed in the namespace of
:mod:`__main__`.
.. index::
single: UNIX
single: command line
single: standard input
A complete program can be passed to the interpreter
in three forms: with the :option:`-c` *string* command line option, as a file
passed as the first command line argument, or as standard input. If the file
or standard input is a tty device, the interpreter enters interactive mode;
otherwise, it executes the file as a complete program.
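Two of these forms can be tried from within Python itself. The sketch below
assumes that ``sys.executable`` points at a working interpreter and that
launching child processes is permitted; the one-line program and the variable
names are made up for the illustration.

```python
import subprocess
import sys

# Form 1: the program is given as a string after the -c option.
from_option = subprocess.check_output(
    [sys.executable, "-c", "print(6 * 7)"])

# Form 3: the program arrives on standard input. Because the pipe is not a
# tty device, the interpreter executes it as a complete program.
child = subprocess.Popen([sys.executable],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)
from_stdin, _ = child.communicate(b"print(6 * 7)\n")
```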
.. _file-input:
File input
==========
All input read from non-interactive files has the same form:
.. productionlist::
file_input: (NEWLINE | `statement`)*
This syntax is used in the following situations:
* when parsing a complete Python program (from a file or from a string);
* when parsing a module;
* when parsing a string passed to the :keyword:`exec` statement.
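The built-in :func:`compile` function with mode ``'exec'`` parses its source
in this form, so it can serve as a quick check of what file input accepts. The
sketch below is illustrative only: the source string, the ``"<example>"``
filename label, and the variable names are all made up. It uses the
parenthesized form of ``exec``, which should work both as the Python 2
statement and as the Python 3 function.

```python
# Blank lines, comments and several statements are all allowed.
source = "\n# a comment line\nx = 1\n\ny = x + 1\n"
code = compile(source, "<example>", "exec")

# Run the compiled statements in a fresh namespace.
namespace = {}
exec(code, namespace)
```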
.. _interactive:
Interactive input
=================
Input in interactive mode is parsed using the following grammar:
.. productionlist::
interactive_input: [`stmt_list`] NEWLINE | `compound_stmt` NEWLINE
Note that a (top-level) compound statement must be followed by a blank line in
interactive mode; this is needed to help the parser detect the end of the input.
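The ``'single'`` mode of :func:`compile` parses one interactive statement at a
time, which gives a rough way to experiment with this grammar outside the
interactive prompt. This is a sketch under that assumption; the sources, the
``"<example>"`` label, and the variable names are hypothetical.

```python
# A list of simple statements on one line, ended by a newline.
simple = compile("x = 1; y = x + 1\n", "<example>", "single")

# A compound statement, followed by a blank line as in interactive mode.
compound = compile("if x:\n    z = y * 2\n\n", "<example>", "single")

namespace = {}
exec(simple, namespace)
exec(compound, namespace)
```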
.. _expression-input:
Expression input
================
.. index:: single: input
.. index:: builtin: eval
There are two forms of expression input. Both ignore leading whitespace. The
string argument to :func:`eval` must have the following form:
.. productionlist::
eval_input: `expression_list` NEWLINE*
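A short sketch of what :func:`eval` accepts and rejects (variable names are
hypothetical):

```python
# Leading whitespace and trailing newlines are tolerated.
value = eval("   3 * (2 + 5)\n\n")

# An expression list with a comma evaluates to a tuple.
pair = eval("1, 2")

# A statement is not an expression list, so it is rejected.
try:
    eval("x = 1")
    rejected = False
except SyntaxError:
    rejected = True
```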
.. index:: builtin: input
The input line read by :func:`input` must have the following form:
.. productionlist::
input_input: `expression_list` NEWLINE
.. index::
object: file
single: input; raw
single: raw input
builtin: raw_input
single: readline() (file method)
Note: to read a 'raw' input line without interpretation, you can use the built-in
function :func:`raw_input` or the :meth:`readline` method of file objects.